# MRAM-based Stochastic Oscillators for Adaptive Non-Uniform Sampling of Sparse Signals in IoT Applications

Soheil Salehi<sup>1</sup>, Alireza Zaeemzadeh<sup>2</sup>, Adrian Tatulian<sup>3</sup>, Nazanin Rahnavard<sup>4</sup>, and Ronald F. DeMara<sup>5</sup>

Department of Electrical and Computer Engineering

University of Central Florida

Orlando, FL, 32816-2362, USA

{<sup>1</sup>soheil.salehi,<sup>3</sup>jcmaxwell}@knights.ucf.edu, {<sup>2</sup>zaeemzadeh,<sup>4</sup>nazanin}@eecs.ucf.edu, <sup>5</sup>ronald.demara@ucf.edu

Abstract-Recent advances in hardware integration and realization of highly efficient Compressive Sensing (CS) approaches have inspired novel circuit-level and architecture-level approaches. These embrace the challenge of designing more optimal non-uniform CS solutions that consider device-level constraints for IoT applications, wherein lifetime energy, device area, and manufacturing costs are highly constrained while the sensing environment changes rapidly. In this manuscript, we develop a novel adaptive hardware-based approach for non-uniform compressive sampling of sparse and time-varying signals. The proposed Adaptive Sampling of Sparse IoT signals via STochastic-oscillators (ASSIST) approach intelligently generates the CS measurement matrix by distributing the sensing energy among coefficients according to signal characteristics, such as sparsity rate and noise level, obtained in the previous time step. In our proposed approach, Magnetic Random Access Memory (MRAM)-based stochastic oscillators are utilized to generate the random bitstreams used in the CS measurement matrix. SPICE and MATLAB circuit-algorithm simulation results indicate that ASSIST efficiently achieves the desired non-uniform recovery of the original signals with varying sparsity rates and noise levels.

Index Terms—Adaptive Compressive Sensing, Non-Uniform Compressive Sensing, MRAM-based Stochastic Oscillator.

## I. INTRODUCTION

Researchers have recently expanded their efforts to maximize signal sensing and reconstruction performance while reducing energy consumption for Internet of Things (IoT) applications such as sensors and mobile devices [1], [2]. Compressive Sensing (CS) has been proposed as a sampling technique aimed at reducing the number of samples taken per frame in order to decrease energy, storage, and data transmission overheads. CS can be used to sample spectrally sparse wide-band signals close to the information rate rather than the Nyquist rate, which can alleviate the high cost of hardware performing sampling at high Nyquist rates [3]–[6].

Implementing non-uniform CS in hardware requires a random number generator (RNG) since CS theory assumes random sampling of data [4]. RNGs can be divided into two classes: true RNGs (TRNGs) and pseudo-RNGs (PRNGs). PRNGs include Linear Feedback Shift Registers (LFSR), which begin with a seed value and then continuously update this value by means of a linear function in order to create the illusion of randomness; such designs can suffer from limited quality in the randomness of the output as well as high energy and area [7]. TRNGs, on the other hand, rely on truly random events such as thermal noise, oscillator jitter, and metastability; TRNG designs can be challenged by limited generation speed as well as post-processing requirements which impose area and power overheads [8].
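The seed-and-update behavior of an LFSR described above can be sketched in a few lines. The 16-bit width, the tap positions (16, 14, 13, 11), and the seed `0xACE1` are illustrative assumptions, not parameters taken from [7] or from this work; they are a commonly cited maximal-length configuration.

```python
def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR by one step (taps 16, 14, 13, 11)."""
    feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (feedback << 15)


def lfsr16_bits(seed, count):
    """Return `count` pseudo-random bits produced from a non-zero seed."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("an all-zero seed locks the register at zero")
    bits = []
    for _ in range(count):
        bits.append(state & 1)  # output the least significant bit
        state = lfsr16_step(state)
    return bits


def lfsr16_period(seed):
    """Count the steps until the register returns to its seed value."""
    state = lfsr16_step(seed)
    steps = 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps


print(lfsr16_bits(0xACE1, 16))
print(lfsr16_period(0xACE1))
```

Because the update is a deterministic linear function of the state, the same seed always reproduces the same sequence and the sequence eventually repeats, which is the sense in which the output only gives the illusion of randomness.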

Previous attempts at TRNG design using spintronics have included the use of bistable superparamagnetic tunnel junctions [7], application of sub-threshold voltages for stochastic switching in magnetic tunnel junctions (MTJs) [8], [9], use of MTJ stack arrangements for precessional switching [10], and exploitation of the voltage-controlled magnetic anisotropy (VCMA) effect [11]. While these designs have been effective in their quality of randomness, they have also involved relatively complex hardware, resulting in power and area overhead. Thus, a spin-based TRNG that minimizes power dissipation and area is sought. Furthermore, previous works on non-uniform compressive sensing have been implemented using Complementary Metal Oxide Semiconductor (CMOS) technology [12], [13].

Herein, we propose a spin-based non-uniform compressive sensing circuit-algorithm solution, called *Adaptive Sampling of Sparse IoT signals via STochastic-oscillators (ASSIST)*, that considers the signal-dependent constraint as well as hardware limitations. The proposed ASSIST approach utilizes Magnetic Random Access Memory (MRAM)-based Stochastic Oscillator (MSO) devices as the main element in TRNGs, which offer miniaturization and significant energy savings [2], [14]. Additionally, MRAM-based Non-Volatile Memory (NVM) is used to store the output of the TRNGs, which forms the elements of the CS measurement matrix.

The remainder of this paper is organized as follows. Background and related work are provided in Section II. In Section III-A, a detailed description of the MRAM-based Non-Volatile Memory (NVM) devices is given. Then, in Section III-B, the MRAM-based stochastic bitstream generators used to develop the proposed ASSIST circuit are described. The proposed ASSIST circuit and architecture are elaborated in Section III-C. Section IV provides the simulation results and comparisons. Finally, Section V concludes this paper while demonstrating the cooperating benefits of emerging and spintronic devices within signal processing applications for IoT.

978-1-7281-3391-1/19/\$31.00 ©2019 IEEE DOI 10.1109/ISVLSI.2019.00079



## II. BACKGROUND AND RELATED WORK

Compressive sensing (CS) is a technique for reconstructing a sparse signal of length $N$ using $M$ measurements, with $M \ll N$. The signal is said to be $k$-sparse if it has at most $k$ non-zero entries in a given basis; the sparsity rate of the signal is defined as $\frac{k}{N}$. The measurement vector $\mathbf{y} \in \mathbb{R}^M$ is related to the signal vector $\mathbf{x} \in \mathbb{R}^N$ by the measurement matrix $\mathbf{\Phi} \in \mathbb{R}^{M \times N}$ through the relation $\mathbf{y} = \mathbf{\Phi} \mathbf{x}$. While this is an underdetermined system with infinitely many solutions, it has been shown that the signal $\mathbf{x}$ can still be recovered from the $M$ measurements by solving the basis pursuit problem:

$$\hat{\mathbf{x}} = \arg \min_{\mathbf{x}} \|\mathbf{x}\|_1 \quad \text{s.t.} \quad \mathbf{y} = \mathbf{\Phi}\mathbf{x} \tag{1}$$

where $\|\mathbf{x}\|_1 = \sum_i |x_i|$. It has been shown that $\hat{\mathbf{x}}$ reconstructs the original signal vector if $\boldsymbol{\Phi}$ satisfies a special condition known as the Restricted Isometry Property (RIP). An $M \times N$ matrix $\boldsymbol{\Phi}$ satisfies RIP($p$) if for any $k$-sparse vector $\mathbf{x}$:

$$\|\mathbf{x}\|_{p} (1-\delta) \le \|\boldsymbol{\Phi}\mathbf{x}\|_{p} \le \|\mathbf{x}\|_{p} (1+\delta), \quad 0 < \delta < 1 \tag{2}$$
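To make the $\ell_1$ recovery idea concrete, the following is a minimal sketch that swaps the equality-constrained problem (1) for its common penalized relaxation, $\min_{\mathbf{x}} \frac{1}{2}\|\mathbf{y} - \mathbf{\Phi}\mathbf{x}\|_2^2 + \lambda\|\mathbf{x}\|_1$, solved with the iterative soft-thresholding algorithm (ISTA). This is not the recovery algorithm used in this paper; the problem size, the 0/1 Bernoulli matrix, the value of `lam`, and the iteration count are all assumptions chosen for illustration.

```python
import math
import random


def matvec(a, x):
    """Return a @ x for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def rmatvec(a, r):
    """Return a.T @ r."""
    return [sum(a[i][j] * r[i] for i in range(len(a))) for j in range(len(a[0]))]


def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def objective(a, y, x, lam):
    """Penalized least-squares objective: 0.5 * ||y - a x||^2 + lam * ||x||_1."""
    residual = [yi - pi for yi, pi in zip(y, matvec(a, x))]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)


def ista(a, y, lam=0.05, iters=300):
    """Run ISTA from x = 0 and return the estimate and the objective history."""
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so this conservative step cannot increase the objective.
    step = 1.0 / sum(aij * aij for row in a for aij in row)
    x = [0.0] * len(a[0])
    history = [objective(a, y, x, lam)]
    for _ in range(iters):
        grad = rmatvec(a, [pi - yi for pi, yi in zip(matvec(a, x), y)])
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, grad)]
        history.append(objective(a, y, x, lam))
    return x, history


rng = random.Random(0)
M, N, K = 20, 40, 3  # hypothetical sizes: M measurements, length N, K non-zeros
phi = [[rng.randint(0, 1) for _ in range(N)] for _ in range(M)]
x_true = [0.0] * N
for idx in rng.sample(range(N), K):
    x_true[idx] = 1.0
y = matvec(phi, x_true)
x_hat, history = ista(phi, y)
print(f"objective: {history[0]:.3f} -> {history[-1]:.3f}")
```

The conservative step size makes the sketch easy to verify but slow to converge; practical solvers use tighter step sizes or dedicated basis pursuit methods.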

In real-world applications, signals may contain special Regions of Interest (RoI), i.e., subsections of the signal which are more critical to accurately reconstruct than the rest of the signal [5], [6]. Moreover, the sparsity of the signal may be non-uniform. Use of a non-uniform measurement matrix allows RoI and parts of the signal with higher sparsity rates to be sampled with higher frequency (i.e., sampled with a sub-matrix containing a higher density of ones). It has been verified that non-uniform measurement matrices satisfy the RIP condition and therefore may be used for sparse signal sampling [5], [6].

Prior work on sparse measurement matrices includes Gilbert and Indyk [15] who described several CS recovery algorithms using sparse measurement matrices and Jafarpour et al. [16] who introduced an efficient and low-complexity sparse recovery algorithm. In addition, Kung et al. [17] introduced the concept of neighbor-weighted decoding as a means of partitioned compressive sensing, i.e. partitioning a signal into blocks which can then be decoded in parallel [18]. Gan [19] proposed to have blocks in the measurement matrix correspond to independent parts of the signal. While [17] and [19] do not take signal non-uniformity into account, Yu et al. [20] proposed saliency-based compressive sensing for image processing, where pixels are divided into blocks and the number of measurements applied to a block depends on the saliency of the pixels in that block. Different schemes for non-uniform measurement matrix design have also been reported in [21] and [22].

Recently, researchers have achieved significant performance improvements using sparse signal recovery techniques. Spectrally sparse signals are utilized in many applications such as frequency hopping communications, musical audio signals, cognitive radio networks, and radar/sonar imaging systems [1], [23]. The cornerstone of achieving high-accuracy and efficient CS recovery approaches and non-uniform sampling techniques is the utilization of an adaptive measurement matrix that changes according to the signal characteristics extracted from previous time frames [5], [6]. In most cases, hardware used to implement non-uniform CS sampling and recovery requires a large number of CMOS transistors and incurs significant area overhead and power dissipation [12], [13]. Herein, we propose a low-complexity hardware design to achieve significant power dissipation and area reduction compared to other designs proposed in the literature.

Fig. 1: 3-terminal SHE-MTJ structure, Right: Anti-parallel (high resistance), Left: Parallel (low resistance).

## III. ADAPTIVE SAMPLING OF SPARSE IOT SIGNALS VIA STOCHASTIC-OSCILLATORS (ASSIST)

### A. MRAM-based NVM for Storing CS Measurement Matrix

Researchers have focused on exploring the use of Spin-Hall Effect Magnetic Random Access Memory (SHE-MRAM) due to its high reliability and reduced delay for write operation [24]. SHE-MRAM devices consist of Magnetic Tunnel Junctions (MTJs) that are constructed of two ferromagnetic layers, called free layer and fixed layer, and a thin oxide layer, as well as a Heavy Metal (HM) strip as shown in Fig. 1 [25]. A bidirectional charge current through terminals of the HM, B and C, will generate a spin current that passes through the MTJ device in order to modify the polarization of the free layer to represent: 1) high resistance or Anti-Parallel (AP) state, and 2) low resistance or Parallel (P) state, as depicted in Fig. 1. The states of the MTJ are determined according to the angle,  $\theta$ , between the magnetization orientation of the ferromagnetic layers. The resistance values of the MTJs in the P and AP states are obtained using (3) and (4) [25]:

$$R(\theta) = \begin{cases} R_P = R_{MTJ}, & \theta = 0\\ R_{AP} = R_{MTJ}(1 + TMR), & \theta = \pi \end{cases} \tag{3}$$

$$TMR(T, V_b) = \frac{2P^2(1 - \alpha_{sp}T^{3/2})^2}{1 - P^2(1 - \alpha_{sp}T^{3/2})^2} \cdot \frac{1}{1 + \left(\frac{V_b}{V_0}\right)^2} \tag{4}$$

where $R_{MTJ} = RA/Area$, $T$ is the temperature, $P$ is the spin polarization, $V_b$ is the bias voltage, $V_0$ is a fitting parameter, and $\alpha_{sp}$ is a material-dependent constant.
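As a numerical illustration of (3) and (4), the sketch below evaluates $R_P$ and $R_{AP}$ using the RA product, MTJ dimensions, spin polarization, and temperature from Table I. The values of `ALPHA_SP` and `V0` are not given in this paper; the numbers used here are placeholders assumed purely for illustration, as is the 0.1 V bias.

```python
import math

# Values from Table I
RA_OHM_UM2 = 9.0        # resistance-area product, ohm * um^2
MTJ_LENGTH_UM = 0.060   # 60 nm
MTJ_WIDTH_UM = 0.030    # 30 nm
POLARIZATION = 0.52
TEMPERATURE_K = 358.0

# Hypothetical placeholders (not specified in the paper)
ALPHA_SP = 2e-5         # material-dependent constant, K^(-3/2)
V0 = 0.65               # fitting parameter, V


def mtj_area_um2():
    """Elliptical MTJ area: l * w * pi / 4."""
    return MTJ_LENGTH_UM * MTJ_WIDTH_UM * math.pi / 4.0


def tmr(temp_k, v_bias, p=POLARIZATION, alpha_sp=ALPHA_SP, v0=V0):
    """Temperature- and bias-dependent TMR ratio following (4)."""
    p_eff_sq = (p * (1.0 - alpha_sp * temp_k ** 1.5)) ** 2
    return (2.0 * p_eff_sq / (1.0 - p_eff_sq)) / (1.0 + (v_bias / v0) ** 2)


def mtj_resistances(temp_k, v_bias):
    """Return (R_P, R_AP) in ohms following (3)."""
    r_p = RA_OHM_UM2 / mtj_area_um2()
    return r_p, r_p * (1.0 + tmr(temp_k, v_bias))


r_p, r_ap = mtj_resistances(TEMPERATURE_K, 0.1)
print(f"R_P = {r_p:.1f} ohm, R_AP = {r_ap:.1f} ohm")
```

With the Table I geometry, $R_P = RA/Area$ evaluates to roughly 6.4 k$\Omega$ regardless of the placeholder values; only $R_{AP}$ depends on the assumed $\alpha_{sp}$ and $V_0$.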

SHE-MRAM provides separate read and write paths, which increases the reliability due to a reduction in errors caused by read disturbance, while consuming significantly less energy [25]. The critical spin current required for switching the free layer magnetization orientation is expressed by (5) [24]:

$$I_{S,critical} = 2q\alpha M_S V_{MTJ} \left( H_k + 2\pi M_S \right) / \hbar \tag{5}$$

TABLE I: Parameters of the 3-terminal SHE-MTJ device.

| Parameter      | Description                           | Value                           |
|----------------|---------------------------------------|---------------------------------|
| $MTJ_{Area}$   | $l_{MTJ} \times w_{MTJ} \times \pi/4$ | $60nm \times 30nm \times \pi/4$ |
| $HM_{Volume}$  | $l_{HM} \times w_{HM} \times t_{HM}$  | $100nm \times 60nm \times 3nm$  |
| $t_f$          | Free Layer thickness                  | 1.3 nm                          |
| RA             | MTJ resistance-area product           | $9 \ \Omega \cdot \mu m^2$      |
| T              | Temperature                           | 358 K                           |
| $\alpha$       | Gilbert Damping factor                | 0.007                           |
| P              | Spin Polarization                     | 0.52                            |
| $\theta_{SHE}$ | Spin Hall Angle                       | 0.4                             |
| $\rho_{HM}$    | HM Resistivity                        | $200\ \mu\Omega \cdot cm$       |
| $\lambda_{sf}$ | Spin Flip Length                      | 1.5 nm                          |



Fig. 2: The building block of the proposed MRAM-based Stochastic Oscillator (MSO) [14].

In (5), $V_{MTJ}$ is the MTJ free layer volume. The relation between the SHE-MTJ switching time and the voltage, $v$, applied to the HM terminals is given by (6), in which the critical voltage, $v_c$, is given by (7) [24].

$$\tau_{SHE} = \frac{\tau_0 \ln \left(\pi/2\theta_0\right)}{\left(\frac{v}{v_c}\right) - 1} \tag{6}$$

$$v_c = \frac{8\rho I_c}{\theta_{SHE} \left[1 - \mathrm{sech}\left(\frac{HM_{thick}}{\lambda_{sf}}\right)\right] \pi HM_{length}} \tag{7}$$

$$\theta_0 = \sqrt{\frac{k_B T}{2E_b}} \tag{8}$$

where $\theta_0$ accounts for the effect of stochastic variation, $E_b$ is the thermal barrier of the magnet of volume $V$, $HM_{length}$ is the length of the HM, and $I_c$ is the critical charge current for spin-torque-induced switching. To model the SHE-MTJ, the HM resistance is also required; it is expressed by (9), where $\rho_{HM}$ is the electrical resistivity of the HM.

$$R_{HM} = \frac{\rho_{HM} \cdot HM_{length}}{HM_{width} \times HM_{thick}} \tag{9}$$
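The HM resistance in (9) and the switching-time relation in (6) can be evaluated directly. In the sketch below, the HM resistivity and dimensions come from Table I, while `tau0`, `theta0`, and the ratio $v/v_c$ are not specified in this paper and are treated as hypothetical inputs.

```python
import math


def hm_resistance(rho_ohm_m, length_m, width_m, thick_m):
    """Heavy-metal strip resistance following (9): rho * l / (w * t)."""
    return rho_ohm_m * length_m / (width_m * thick_m)


def she_switching_time(v, v_c, tau0, theta0):
    """Switching time following (6); only defined for v > v_c."""
    if v <= v_c:
        raise ValueError("applied voltage must exceed the critical voltage")
    return tau0 * math.log(math.pi / (2.0 * theta0)) / ((v / v_c) - 1.0)


# Table I: 200 uOhm*cm = 2e-6 Ohm*m, HM volume 100 nm x 60 nm x 3 nm
r_hm = hm_resistance(2e-6, 100e-9, 60e-9, 3e-9)
print(f"R_HM = {r_hm:.1f} ohm")

# Hypothetical tau0 = 1 ns and theta0 = 0.1 rad, swept over v / v_c
for ratio in (1.5, 2.0, 3.0):
    t_sw = she_switching_time(ratio, 1.0, 1e-9, 0.1)
    print(f"v/v_c = {ratio}: tau_SHE = {t_sw * 1e9:.2f} ns")
```

With the Table I values, $R_{HM}$ comes to about 1.1 k$\Omega$, and the sweep shows the expected trend that driving the HM further above $v_c$ shortens the switching time.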

### B. MRAM-based Stochastic Bitstream Generator

Recently, researchers have studied theoretically and experimentally the utilization of thermally unstable superparamagnetic MTJs to realize a variety of functional spintronic devices [14], [26], [27]. Herein, we intend to demonstrate that a recently proposed building block with embedded MRAM technology can enable the hardware realization of a stochastic bitstream generator. The structure of the MRAM-based Stochastic Oscillator (MSO) is depicted in Fig. 2.

TABLE II: Modeling and Simulation Parameters [14].

| Parameters                               | Value                      |
|------------------------------------------|----------------------------|
| Saturation magnetization (CoFeB) $(M_s)$ | 1100 emu/cc                |
| Free Layer diameter, thickness           | 22 nm, 2 nm                |
| Polarization                             | 0.59                       |
| TMR                                      | 110%                       |
| MTJ RA-product                           | $9 \ \Omega \cdot \mu m^2$ |
| Damping coefficient                      | 0.01                       |
| Temperature                              | $26.85^{\circ}C$           |

Due to the low energy barrier (i.e., $E_B \ll 40kT$), the MTJ's resistance level fluctuates between the two resistance states, $R_{AP}$ and $R_P$, which results in the non-uniform stochastic output at the drain of the NMOS transistor shown in Fig. 2. We can amplify the NMOS drain output to provide a full-swing signal, i.e., $[0.0 \rightarrow 0.8]$ V, using a single inverter circuit. The probability of the output being '1' can be controlled using the input signal connected to the gate of the NMOS transistor. By increasing the gate voltage of the NMOS transistor, $V_{IN}$, its drain-source resistance, $r_{ds}$, decreases, which pulls the drain voltage closer to *GND*. On the other hand, by decreasing $V_{IN}$, $r_{ds}$ increases, which pulls the drain voltage closer to $V_{DD}$. Considering the MTJ conductance of the MSO, we can observe the behavior of the circuit shown in Fig. 2 [14]:

$$G_{MTJ} = G_0 \left[ 1 + m_z \frac{TMR}{(2 + TMR)} \right] \tag{10}$$

where  $m_z$  is the free layer magnetization,  $G_0$  is the average MTJ conductance,  $(G_P + G_{AP})/2$ , and TMR is the tunneling magnetoresistance ratio. The drain voltage of the NMOS transistor shown in Fig. 2 can be expressed as:

$$V_{DRAIN}/V_{DD} = \frac{(2 + TMR) + TMR \, m_z}{(2 + TMR)(1 + \alpha) + TMR \, m_z} \tag{11}$$

where $\alpha$ is the ratio of the transistor conductance, $G_T$, to the average MTJ conductance, $G_0$. When $\alpha \approx 1$, maximum fluctuations can be achieved. This means that when $V_{IN} = V_{DD}/2$, the MTJ resistance is approximately equal to $r_{ds}$. In this paper, we use a circular nanomagnet with near-zero energy barrier and without shape anisotropy. Such magnets have been fabricated and characterized in [28]. We use the embedded MRAM-based model developed in [14] to perform SPICE circuit simulations using the parameters listed in Table II and a nominal voltage of $V_{DD} = 0.8$ V. The relation between the probability of the output being '1' and $V_{IN}$ is depicted in Fig. 3(a), where $V_{IN} = V_{DD}/2 = 400$ mV generates an output probability of 50%, as shown in Fig. 3(b).
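Equation (11) follows from treating the MTJ and the NMOS transistor as a conductance divider, $V_{DRAIN}/V_{DD} = G_{MTJ}/(G_{MTJ} + G_T)$, with $G_{MTJ}$ taken from (10) and $G_T = \alpha G_0$. The sketch below checks this numerically and compares the drain swing between $m_z = \pm 1$ for several values of $\alpha$. It assumes the MTJ sits between $V_{DD}$ and the drain with the NMOS between the drain and GND, as implied by the text; the TMR of 110% is taken from Table II, and the sampled $\alpha$ values are arbitrary.

```python
TMR = 1.10  # 110% from Table II
VDD = 0.8   # nominal supply voltage, V


def g_mtj(m_z, g0=1.0, tmr=TMR):
    """MTJ conductance following (10)."""
    return g0 * (1.0 + m_z * tmr / (2.0 + tmr))


def drain_ratio(m_z, alpha, tmr=TMR):
    """V_DRAIN / V_DD following the closed form (11)."""
    return ((2.0 + tmr) + tmr * m_z) / ((2.0 + tmr) * (1.0 + alpha) + tmr * m_z)


def drain_ratio_divider(m_z, alpha, g0=1.0, tmr=TMR):
    """Same quantity computed directly as a conductance divider."""
    g = g_mtj(m_z, g0, tmr)
    return g / (g + alpha * g0)


def swing(alpha):
    """Drain-voltage swing (in volts) between m_z = +1 and m_z = -1."""
    return VDD * (drain_ratio(1.0, alpha) - drain_ratio(-1.0, alpha))


for alpha in (0.1, 1.0, 10.0):
    print(f"alpha = {alpha}: swing = {swing(alpha) * 1e3:.1f} mV")
```

Among the three sampled values, $\alpha = 1$ gives the largest swing, which is consistent with the statement that fluctuations are maximized when the transistor and average MTJ conductances are matched.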

### C. ASSIST Circuit-Architecture Solution

The proposed MRAM-based stochastic bitstream generator circuit is depicted in Fig. 4(a), wherein a 2-terminal low energy-barrier, thermally unstable MTJ is utilized. As shown in Fig. 4(a), the output of the MSO is connected to a D-Flip-Flop (D-FF), which is controlled by a Power-Gated Clock (PG-CLK). This provides control over the number of stochastic outputs provided by the MSO. In other words, by setting PG-CLK to run for $M$ clock cycles, we obtain a stochastic bitstream output, $V_M$, with a length of $M$ bits, as shown in Fig. 4(a). Additionally, having control over $V_N$ enables us to adaptively adjust the number of '1's that appear in the output bitstream, $V_M$. As shown in Fig. 4(c), we have utilized a complementary SHE-MRAM array to store the elements of the measurement matrix, and for each column of the measurement matrix we have used an MRAM-based stochastic bitstream generator. Thus, in order to adaptively change the number of rows in the measurement matrix to account for an increased sparsity rate, we can adjust $V_M$ accordingly to increase the number of measurements. Furthermore, in order to increase the accuracy of the signal recovery, we can increase $V_N$ of the MRAM-based stochastic bitstream generators located in the columns corresponding to the RoI to maintain more '1's in the measurement matrix. It is worth noting that in order to use the MRAM-based stochastic bitstream generator output to write into the SHE-MRAM bit-cells, the PG-CLK clock cycle should be long enough for the write current to flow through the HM of the SHE-MTJs.

Fig. 3: (a) Output probability of MSO building block for ASSIST versus its input voltage, (b) The output and sampled output voltages for $V_{IN} = 0.5V_{DD} = 400$ mV.

As mentioned earlier, we utilize the non-volatile complementary SHE-MRAM array, which results in a wide read margin and increases the reliability of the read operation [25]. Additionally, using a non-volatile complementary SHE-MRAM array enables a clockless read operation that is rapid, reliable, and energy-efficient. In order to use the MSO to write into the SHE-MRAM bit-cells, we utilize the circuit shown in Fig. 4(b). Every column of the SHE-MRAM array shown in Fig. 4(c) is populated using a separate MSO of the type shown in Fig. 4(a).
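A behavioral sketch of this column-wise population is given below. Each column has its own generator, which is clocked for $M$ cycles to produce $M$ bits, and columns assigned to the RoI receive a higher control voltage so that they contain more '1's. The sketch assumes the per-column control voltage acts like the gate voltage $V_{IN}$ of Section III-B. The sigmoid `p_one` is a hypothetical stand-in for the measured curve in Fig. 3(a), constrained only to give 50% at $V_{DD}/2$; its slope, the example voltages, and the matrix size are assumptions, and Python's `random` module replaces the physical MSO.

```python
import math
import random

VDD = 0.8  # nominal supply voltage, V


def p_one(v_in, vdd=VDD, slope=20.0):
    """Hypothetical probability of a '1' versus input voltage (0.5 at VDD/2)."""
    return 1.0 / (1.0 + math.exp(-slope * (v_in - vdd / 2.0)))


def bitstream(v_in, m_cycles, rng):
    """Sample one bit per PG-CLK cycle, for m_cycles cycles."""
    p = p_one(v_in)
    return [1 if rng.random() < p else 0 for _ in range(m_cycles)]


def measurement_matrix(v_in_per_column, m_cycles, seed=0):
    """Build an M x N binary matrix, one independent bitstream per column."""
    rng = random.Random(seed)
    columns = [bitstream(v, m_cycles, rng) for v in v_in_per_column]
    return [[col[i] for col in columns] for i in range(m_cycles)]


def column_density(matrix, j):
    """Fraction of ones in column j."""
    return sum(row[j] for row in matrix) / len(matrix)


# Eight non-RoI columns at 0.3 V and two RoI columns at 0.5 V (illustrative)
voltages = [0.3] * 8 + [0.5] * 2
phi = measurement_matrix(voltages, m_cycles=400, seed=1)
densities = [column_density(phi, j) for j in range(len(voltages))]
print([round(d, 2) for d in densities])
```

Changing `m_cycles` changes the number of rows (measurements), and changing an entry of `voltages` changes the density of ones in that column only, which mirrors the two adaptation knobs described above.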

In order to write into each memory cell, WWL should be asserted to enable the write Transmission Gates (TGs), TGW. Then, by setting the Bit Line, BL, and the Source Line, SL, we can write complementary data values into MTJ and $\overline{MTJ}$. Additionally, in order to use the MSO to write into the SHE-MTJ devices, the output of the D-FF is connected to the write NMOS transistor, NW. Thus, if the output of the D-FF is '1', then NW is turned on, resulting in a current passing through the SHE-MTJs. On the other hand, if the output of the D-FF is '0', then NW does not turn on and the contents of the SHE-MTJs remain untouched. To read the data stored in the SHE-MTJs, RWL is asserted, which turns on the read TG, TGR. Additionally, the read transistors, PR and NR, are enabled. Thus, by applying VDD at BL and GND at SL, a read path from VDD to GND is formed. This forms a voltage divider circuit, and by connecting the node between the complementary SHE-MTJs, $\mathbf{D}_{out}$, to two inverter logic gates, the output voltage is amplified and presented at the output node, OUT.
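The write gating and the divider-based read described above can be captured in a small behavioral model. This is a simplified sketch rather than the circuit of Fig. 4(b): it ignores the transistor and HM resistances, assumes that a stored '1' places the low-resistance MTJ on the BL side, models the two inverters as an ideal threshold at $V_{DD}/2$, and uses an assumed TMR of 0.5. The class and method names are hypothetical.

```python
VDD = 0.8     # nominal supply voltage, V
R_P = 6366.2  # ohms, RA / Area using the Table I geometry
TMR = 0.5     # hypothetical TMR ratio at the read bias


class ComplementaryCell:
    """Behavioral model of one complementary SHE-MRAM bit-cell."""

    def __init__(self):
        self.bit = 0  # assumed initial state

    def write(self, dff_out, data):
        """Store `data` only when the D-FF output turns NW on."""
        if dff_out == 1:
            self.bit = data

    def d_out(self):
        """Divider voltage at the node between MTJ and its complement."""
        r_ap = R_P * (1.0 + TMR)
        # Assumed mapping: a stored '1' puts the P device on the BL (VDD) side.
        r_top, r_bottom = (R_P, r_ap) if self.bit else (r_ap, R_P)
        return VDD * r_bottom / (r_top + r_bottom)

    def read(self):
        """Two cascaded inverters, modeled as a threshold at VDD / 2."""
        return 1 if self.d_out() > VDD / 2.0 else 0


cell = ComplementaryCell()
cell.write(dff_out=0, data=1)  # NW off: contents remain untouched
print(cell.read(), round(cell.d_out(), 3))
cell.write(dff_out=1, data=1)  # NW on: the cell is written
print(cell.read(), round(cell.d_out(), 3))
```

Because the two devices always hold opposite states, the divider node sits on either side of $V_{DD}/2$ by the same margin, which is the source of the wide read margin attributed to the complementary arrangement.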

## IV. SIMULATION RESULTS

In order to evaluate and validate the behavior and functionality of the proposed ASSIST approach, SPICE and MATLAB simulations were performed. We utilized the 14nm HP-FinFET Predictive Technology Model (PTM) library as well as the MSO device model and parameters presented in [14], along with the other circuit parameters and constants listed in Table I and Table II, to implement and evaluate the proposed ASSIST approach.

According to our simulation results, the power dissipation of the stochastic bitstream generator circuit is $23\mu$W on average over a period of 100ns for generating a 100-bit bitstream in which '0's and '1's are equally likely. Furthermore, the area estimate of each stochastic bitstream generator circuit in the 14nm technology node according to the transistor count is $0.4\mu$m<sup>2</sup>. For a more equitable comparison in terms of area and energy consumption per bit, we have derived (12) and (13) using the general scaling method [29] to normalize the energy consumption per bit and the area of the designs listed in Table III. Based on the general scaling method, voltage and area scale at different rates of $U$ and $S$, respectively. Thus, the energy consumption is scaled with respect to $1/SU^2$ and the area per device is scaled according to $1/S^2$ [29].

$$\begin{aligned} Energy_{norm} &= \frac{Energy_x}{Energy_{MSO}} \times \left(\frac{1}{S}\right) \times \left(\frac{1}{U}\right)^2 \\ &= \frac{Energy_x}{Energy_{MSO}} \times \left(\frac{14nm}{Technology}\right) \times \left(\frac{0.8V}{V_{DD}}\right)^2 \end{aligned} \tag{12}$$

$$\begin{aligned} Area_{norm} &= \frac{Area_x}{Area_{MSO}} \times \left(\frac{1}{S}\right)^2 \\ &= \frac{Area_x}{Area_{MSO}} \times \left(\frac{14nm}{Technology}\right)^2 \end{aligned} \tag{13}$$

where $V_{DD}$ is the nominal voltage of the technology model, *Technology* refers to the technology node in nanometers, and subscript $x$ refers to the design whose energy consumption and area are scaled according to the technology models. According to (12) and (13), MSO reduces energy consumption per bit by $\sim$9-fold on average compared to the state-of-the-art TRNGs listed in Table III. Additionally, MSO offers a $\sim$3-fold area reduction on average compared to the TRNG designs in Table III, using the scaling comparison trends accepted in the literature.

Fig. 4: The proposed ASSIST approach, where (a) depicts the stochastic bitstream generator circuit, (b) shows a complementary MTJ memory bit-cell connected to the stochastic bitstream generator, and (c) illustrates the architecture view.

TABLE III: Comparison with recent TRNG designs

| Design    | Technology $(V_{DD})$ | $Energy_{norm}$ | $Area_{norm}$ |
|-----------|-----------------------|-----------------|---------------|
| [7]       | 28nm (1.0V)           | 0.3X            | 1.25X         |
| [8]       | 28nm (1.0V)           | 8.9X            | 4.8X          |
| [9]       | 28nm (1.0V)           | 17.4X           | 3.7X          |
| This Work | 14nm (0.8V)           | 1X              | 1X            |
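The arithmetic behind these figures can be reproduced with a short script. It computes the MSO energy per bit from the reported $23\mu$W over 100ns for 100 bits, implements the normalization factors of (12) and (13), and averages the entries of Table III. The function names are hypothetical, and the unit-ratio calls at the end only illustrate the scaling factor for a 28nm, 1.0V design; the raw energy and area values of [7]-[9] are not reproduced here.

```python
def energy_per_bit(power_w, duration_s, n_bits):
    """Average energy per generated bit, in joules."""
    return power_w * duration_s / n_bits


def energy_norm(energy_x, energy_mso, tech_nm, vdd, ref_tech_nm=14.0, ref_vdd=0.8):
    """Normalized energy following (12)."""
    return (energy_x / energy_mso) * (ref_tech_nm / tech_nm) * (ref_vdd / vdd) ** 2


def area_norm(area_x, area_mso, tech_nm, ref_tech_nm=14.0):
    """Normalized area following (13)."""
    return (area_x / area_mso) * (ref_tech_nm / tech_nm) ** 2


# (Energy_norm, Area_norm) entries copied from Table III
TABLE_III = {"[7]": (0.3, 1.25), "[8]": (8.9, 4.8), "[9]": (17.4, 3.7)}

mso_energy = energy_per_bit(23e-6, 100e-9, 100)
mean_energy = sum(e for e, _ in TABLE_III.values()) / len(TABLE_III)
mean_area = sum(a for _, a in TABLE_III.values()) / len(TABLE_III)

print(f"MSO energy per bit: {mso_energy * 1e15:.1f} fJ")
print(f"mean Energy_norm: {mean_energy:.2f}, mean Area_norm: {mean_area:.2f}")
print(f"28nm/1.0V energy scaling factor: {energy_norm(1.0, 1.0, 28.0, 1.0):.2f}")
print(f"28nm area scaling factor: {area_norm(1.0, 1.0, 28.0):.2f}")
```

The MSO energy works out to 23 fJ per bit, and the Table III averages are about 8.9 for energy and 3.25 for area, which round to the 9-fold and 3-fold figures quoted above.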

Furthermore, the transient output of a single complementary SHE-MRAM NVM bit-cell shown in Fig. 4(b) is provided in Fig. 5. According to our simulation results, writing into an NVM bit-cell requires 155.2fJ on average, while reading the content of an NVM bit-cell requires 21.9fJ on average. Additionally, based on our simulation results, the standby energy consumption is 36.4aJ. Moreover, in Fig. 6 we use the sampling and recovery algorithm discussed in [5], [23] to evaluate the performance of ASSIST for different values of the undersampling ratio, $\frac{M}{N}$, for a signal with a sparsity level of $\frac{k}{N} = 0.1$, considering $N = 200$ and an RoI that occupies 10% of the entire signal. This experiment shows that the proposed ASSIST is able to decrease the Time-Averaged Normalized Mean Squared Error (TNMSE) of the RoI coefficients by up to 2dB for various undersampling ratios. This benefit comes at the cost of reduced performance on total recovery error. It is worth noting that for smaller undersampling ratios, ASSIST incurs no additional performance degradation compared to uniform CS for non-RoI entries.
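The TNMSE metric is not defined explicitly in this paper. The sketch below assumes one plausible definition: the normalized squared error over a chosen index set (RoI or non-RoI) is computed for each time frame, averaged over frames, and expressed in dB. The function names and the toy frames are illustrative only.

```python
import math


def nmse(x, x_hat, indices):
    """Normalized squared error restricted to the given coefficient indices."""
    num = sum((x[i] - x_hat[i]) ** 2 for i in indices)
    den = sum(x[i] ** 2 for i in indices)
    if den == 0.0:
        raise ValueError("reference signal has no energy on the chosen indices")
    return num / den


def tnmse_db(frames, estimates, indices):
    """Average the per-frame NMSE over time, then convert to dB."""
    avg = sum(nmse(x, xh, indices) for x, xh in zip(frames, estimates)) / len(frames)
    return 10.0 * math.log10(avg)


# Toy data: two frames of length 4, with indices 0 and 1 treated as the RoI
frames = [[1.0, 2.0, 0.0, 1.0], [2.0, 1.0, 1.0, 0.0]]
estimates = [[0.9, 2.0, 0.0, 0.5], [2.0, 1.1, 0.5, 0.0]]
roi, non_roi = [0, 1], [2, 3]
print(f"RoI TNMSE: {tnmse_db(frames, estimates, roi):.2f} dB")
print(f"non-RoI TNMSE: {tnmse_db(frames, estimates, non_roi):.2f} dB")
```

Under this definition, a 2dB improvement on the RoI corresponds to reducing the time-averaged normalized error on those coefficients by a factor of about 1.6.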

## V. CONCLUSIONS

We have devised a spin-based non-uniform compressive sensing circuit-algorithm solution, called Adaptive Sampling of Sparse IoT signals via STochastic-oscillators (ASSIST), that considers the signal-dependent constraint as well as hardware limitations. The high-payoff considerations leveraged herein for device hardware optimization include the signal sparsity and noise levels. According to our simulation results, the MRAM-based Stochastic Oscillator (MSO) used as a TRNG provides a significant area improvement of $\sim$3-fold while reducing energy consumption per bit by $\sim$9-fold, on average, compared to similar TRNGs presented in the literature. Additionally, our circuit-algorithm simulation results indicate that ASSIST efficiently achieves the desired non-uniform recovery of the original signals with varying sparsity rates and noise levels.

## ACKNOWLEDGEMENT

This work was supported in part by the Center for Probabilistic Spin Logic for Low-Energy Boolean and Non-Boolean Computing (CAPSL), one of the Nanoelectronic Computing Research (nCORE) Centers as task 2759.006, a Semiconductor Research Corporation (SRC) program sponsored by the NSF through CCF 1739635, and by NSF through ECCS 1810256.



Fig. 5: Transient output for SHE-MRAM NVM array: writing and reading a (a) '0' bit, and (b) '1' bit.



Fig. 6: TNMSE vs. Undersampling Ratio,  $\frac{M}{N}$ , for a signal with  $\frac{k}{N} = 0.1$ , N = 200, and RoI occupying 10% of N.

## REFERENCES

- [1] S. Salehi, M. B. Mashhadi, A. Zaeemzadeh, N. Rahnavard, and R. F. DeMara, "Energy-Aware Adaptive Rate and Resolution Sampling of Spectrally Sparse Signals Leveraging VCMA-MTJ Devices," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 8, pp. 679–692, 12 2018.
- [2] S. Salehi, R. Zand, A. Zaeemzadeh, N. Rahnavard, and R. F. DeMara, "AQuRate: MRAM-based stochastic oscillator for adaptive quantization rate sampling of sparse signals," in *Proceedings of 2019 Great Lakes Symposium on VLSI*, GLSVLSI '19, (Tysons Corner, VA, USA), 2019.
- [3] S. Sarvotham, D. Baron, and R. G. Baraniuk, "Measurements vs. Bits: Compressed Sensing meets Information Theory," in *Allerton Conference on Communication, Control and Computing*, 9 2006.
- [4] N. Rahnavard, A. Talari, and B. Shahrasbi, "Non-uniform compressive sensing," in *Communication, Control, and Computing (Allerton), 2011 49th Annual Allerton Conference on*, pp. 212–219, IEEE, 2011.
- [5] A. Zaeemzadeh, M. Joneidi, and N. Rahnavard, "Adaptive non-uniform compressive sampling for time-varying signals," in 2017 51st Annual Conference on Information Sciences and Systems (CISS), (Baltimore, MD), pp. 1–6, IEEE, 3 2017.
- [6] B. Shahrasbi and N. Rahnavard, "Model-Based Nonuniform Compressive Sampling and Recovery of Natural Images Utilizing a Wavelet-Domain Universal Hidden Markov Model," *IEEE Transactions on Signal Processing*, vol. 65, pp. 95–104, 1 2017.
- [7] D. Vodenicarevic, N. Locatelli, A. Mizrahi, J. Friedman, A. Vincent, M. Romera, A. Fukushima, K. Yakushiji, H. Kubota, S. Yuasa, S. Tiwari, J. Grollier, and D. Querlioz, "Low-Energy Truly Random Number Generation with Superparamagnetic Tunnel Junctions for Unconventional Computing," *Physical Review Applied*, vol. 8, p. 054045, 11 2017.
- [8] Y. Qu, J. Han, B. F. Cockburn, W. Pedrycz, Y. Zhang, and W. Zhao, "A True Random Number Generator Based on Parallel STT-MTJs," in *Proceedings of the Conference on Design, Automation & Test in Europe (DATE '17)*, pp. 606–609, 2017.
- [9] Y. Wang, H. Cai, L. Alves De Barros Naviner, J.-O. Klein, and W. Zhao, "A Novel Circuit Design of True Random Number Generator Using Magnetic Tunnel Junction," in *IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)*, pp. 123–128, 2016.

- [10] N. Rangarajan, A. Parthasarathy, and S. Rakheja, "A Spin-based True Random Number Generator Exploiting the Stochastic Precessional Switching of Nanomagnets," *Journal of Applied Physics*, vol. 121, p. 223905, 6 2017.
- [11] H. Lee, F. Ebrahimi, P. K. Amiri, and K. L. Wang, "Design of high-throughput and low-power true random number generator utilizing perpendicularly magnetized voltage-controlled magnetic tunnel junction," *AIP Advances*, vol. 7, p. 055934, 5 2017.
- [12] D. Bellasi, L. Bettini, T. Burger, Q. Huang, C. Benkeser, and C. Studer, "A 1.9 GS/s 4-bit sub-Nyquist flash ADC for 3.8 GHz compressive spectrum sensing in 28 nm CMOS," in 2014 IEEE 57th International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 101– 104, IEEE, 8 2014.
- [13] T.-F. Wu, C.-R. Ho, and M. S.-W. Chen, "A Flash-Based Non-Uniform Sampling ADC With Hybrid Quantization Enabling Digital Anti-Aliasing Filter," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 9, pp. 2335–2349, 2017.
- [14] K. Y. Camsari, S. Salahuddin, and S. Datta, "Implementing p-bits with embedded MTJ," *IEEE Electron Device Letters*, vol. 38, no. 12, pp. 1767–1770, 2017.
- [15] A. Gilbert and P. Indyk, "Sparse recovery using sparse matrices," *Proceedings of the IEEE*, vol. 98, no. 6, pp. 937–947, 2010.
- [16] S. Jafarpour, W. Xu, B. Hassibi, and R. Calderbank, "Efficient and robust compressed sensing using optimized expander graphs," *IEEE Transactions on Information Theory*, vol. 55, no. 9, pp. 4299–4308, 2009.
- [17] H. Kung and S. J. Tarsa, "Partitioned compressive sensing with neighbor-weighted decoding," in 2011-MILCOM 2011 Military Communications Conference, pp. 149–156, IEEE, 2011.
- [18] A. Zaeemzadeh, J. Haddock, N. Rahnavard, and D. Needell, "A Bayesian Approach for Asynchronous Parallel Sparse Recovery," in 52nd Asilomar Conference on Signals, Systems, and Computers, (Pacific Grove, CA), pp. 1980–1984, IEEE, 2018.
- [19] L. Gan, "Block compressed sensing of natural images," in 2007 15th International conference on digital signal processing, pp. 403–406, IEEE, 2007.
- [20] Y. Yu, B. Wang, and L. Zhang, "Saliency-based compressive sampling for image signals," *IEEE signal processing letters*, vol. 17, no. 11, pp. 973–976, 2010.
- [21] Y. Shen, W. Hu, R. Rana, and C. T. Chou, "Nonuniform compressive sensing for heterogeneous wireless sensor networks," *IEEE Sensors journal*, vol. 13, no. 6, pp. 2120–2128, 2013.
- [22] Y. Liu, X. Zhu, L. Zhang, and S. H. Cho, "Expanding window compressed sensing for non-uniform compressible signals," *Sensors*, vol. 12, no. 10, pp. 13034–13057, 2012.
- [23] A. Zaeemzadeh, M. Joneidi, N. Rahnavard, and G. Qi, "Co-spot: Cooperative spectrum opportunity detection using bayesian clustering in spectrum-heterogeneous cognitive radio networks," *IEEE Transactions on Cognitive Communications and Networking*, vol. 4, pp. 206–219, June 2018.
- [24] R. Zand, A. Roohi, D. Fan, and R. F. DeMara, "Energy-Efficient Nonvolatile Reconfigurable Logic Using Spin Hall Effect-Based Lookup Tables," *IEEE Transactions on Nanotechnology*, vol. 16, no. 1, pp. 32– 43, 2017.
- [25] S. Salehi and R. F. DeMara, "BGIM: Bit-Grained Instant-on Memory Cell for Sleep Power Critical Mobile Applications," in 2018 IEEE 36th International Conference on Computer Design (ICCD), pp. 342–345, IEEE, 10 2018.
- [26] B. Sutton, K. Y. Camsari, B. Behin-Aein, and S. Datta, "Intrinsic optimization using stochastic nanomagnets," *Scientific reports*, vol. 7, p. 44370, 2017.
- [27] R. Zand, K. Y. Camsari, S. D. Pyle, I. Ahmed, C. H. Kim, and R. F. DeMara, "Low-energy deep belief networks using intrinsic sigmoidal spintronic-based probabilistic neurons," in *Proceedings of the 2018 on Great Lakes Symposium on VLSI*, GLSVLSI '18, (Chicago, IL, USA), pp. 15–20, ACM, 2018.
- [28] P. Debashis, R. Faria, K. Y. Camsari, and Z. Chen, "Design of stochastic nanomagnets for probabilistic spin logic," *IEEE Magnetics Letters*, vol. 9, pp. 1–5, 2018.
- [29] A. Stillmaker and B. Baas, "Scaling equations for the accurate prediction of CMOS device performance from 180nm to 7nm," *Integration*, vol. 58, pp. 74–81, 6 2017.