Sridhar Rajagopal: Research Contributions and Statement of Research Interests

Stream Processors for wireless base-stations (NEW)

Research Contributions:

My research contributions are in the design of efficient algorithms and architectures that meet real-time requirements (10-100 Mbps) for the physical layer in emerging high data rate wireless systems. I have explored the joint algorithm-architecture design space at several levels, and this has given me critical insights into providing real-time solutions for wireless systems. My contributions include (i) algorithm design for lower complexity solutions, (ii) task-partitioning algorithms on existing FPGA and DSP architectures, (iii) investigating computer arithmetic for area-time-power benefits and (iv) designing real-time ASICs and scalable, programmable architectures.

Programmable architecture designs for wireless systems

My dissertation work explores scalable multi-cluster programmable architecture designs for meeting real-time requirements in high data rate wireless systems. My objective is to identify and solve the bottlenecks that arise when a hand-scheduled ASIC design is converted into code for a programmable architecture. My dissertation work includes dynamically scaling the number of clusters in the architecture to varying levels of data parallelism, minimizing inter-cluster communication latencies, exploring trade-offs between memory stalls and functional unit utilization, and rapid prototyping of emerging wireless communication systems.
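As a rough illustration of scaling the number of clusters to the available data parallelism, the following minimal sketch picks how many clusters to keep active for a kernel. The function name, its parameters and the simple "use as many clusters as there is parallelism" policy are assumptions made for illustration only; they are not taken from the dissertation design.

```python
import math
from typing import Tuple


def plan_clusters(data_parallelism: int, total_clusters: int) -> Tuple[int, int, int]:
    """Return (active, idle, passes) for one kernel.

    Hypothetical policy: keep as many clusters active as there are
    independent data elements, up to the number of clusters available.
    Idle clusters are the ones that could in principle be turned off.
    """
    if data_parallelism < 1 or total_clusters < 1:
        raise ValueError("both arguments must be positive")
    active = min(data_parallelism, total_clusters)
    idle = total_clusters - active
    # Number of sequential passes needed to cover all data elements.
    passes = math.ceil(data_parallelism / active)
    return active, idle, passes


if __name__ == "__main__":
    # A kernel with 32-way parallelism on 8 clusters, then a 4-way kernel.
    print(plan_clusters(32, 8))
    print(plan_clusters(4, 8))
```

A real design would likely add constraints this sketch ignores, such as the cost of reconfiguring between kernels.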

Computer arithmetic for wireless systems

My contributions include evaluating the use of on-line arithmetic, which performs computations most significant digit first, for truncated operations such as sign detection. Because on-line arithmetic processes data most significant digit first, we can truncate the result and stop computing successive digits as soon as the first non-zero digit is received for sign detection, which yields higher throughput by eliminating unnecessary computations. I have also supervised students building a code-matched filter that uses these techniques, as part of the VLSI design course in 2002.
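The early-termination idea can be sketched in software as follows. This is only an illustration, assuming the result arrives as a finite stream of radix-2 signed digits in {-1, 0, 1}, most significant digit first; the function name and the digit encoding are assumptions and do not describe the actual hardware.

```python
from typing import Iterable, Tuple


def msd_first_sign(digits: Iterable[int]) -> Tuple[int, int]:
    """Return (sign, digits_consumed) for a number streamed MSD first.

    Assumes radix-2 signed digits in {-1, 0, 1}. For a finite digit
    string, the first non-zero digit fixes the sign, because the
    remaining lower-order digits cannot outweigh it.
    """
    consumed = 0
    for d in digits:
        if d not in (-1, 0, 1):
            raise ValueError("digits must be -1, 0 or 1")
        consumed += 1
        if d != 0:
            # Stop here: later digits are never examined.
            return (1 if d > 0 else -1), consumed
    # Every digit was zero, so the value is zero.
    return 0, consumed


if __name__ == "__main__":
    # Represents -1/8 + 1/16 + 1/32 = -1/32; only 3 of 6 digits are read.
    print(msd_first_sign([0, 0, -1, 1, 1, 0]))
```

The second element of the result shows how many digits were needed, which is where the throughput gain over computing the full result comes from.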

Task-partitioning and scheduling wireless algorithms on heterogeneous architectures

My contributions include scheduling and partitioning multiuser channel estimation and detection algorithms on an existing heterogeneous system that contained two DSPs and two FPGAs. The algorithms were scheduled on the architectures to provide high data rates by minimizing the inter-processor communication and by optimizing use of DSP and FPGA architectures.

Efficient algorithm and ASIC designs for wireless systems

My contributions include (i) evaluating a gradient descent based algorithm to replace matrix inversions for multiuser channel estimation, which results in fixed-point parallel architectures without loss in bit error rates, and (ii) pipelining a block-based multiuser detection algorithm based on parallel interference cancellation so that it reduces the memory requirements by the block length and provides simpler and faster implementations in ASICs and DSPs. I have also directed students to build an asynchronous multiuser detector with 3 parallel interference cancellation stages, as part of the VLSI design course in 2000.
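To illustrate how an iterative update can stand in for a matrix inversion, the sketch below solves R a = p by plain gradient descent instead of computing the inverse of R. It is a generic textbook iteration written in floating point, assuming R is symmetric positive definite; the names, the fixed step size and the small example system are assumptions and are not the channel estimation algorithm itself.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(R: Matrix, a: Vector) -> Vector:
    """Multiply matrix R by vector a."""
    return [sum(r_ij * a_j for r_ij, a_j in zip(row, a)) for row in R]


def gradient_descent_solve(R: Matrix, p: Vector, step: float, iterations: int) -> Vector:
    """Approximate the solution of R a = p without inverting R.

    Assumes R is symmetric positive definite and that
    0 < step < 2 / (largest eigenvalue of R), so the iteration converges.
    Each update uses only multiply-accumulate operations.
    """
    a = [0.0] * len(p)
    for _ in range(iterations):
        # The residual p - R a is the negative gradient of
        # 0.5 * a^T R a - p^T a.
        residual = [p_i - ra_i for p_i, ra_i in zip(p, mat_vec(R, a))]
        a = [a_i + step * r_i for a_i, r_i in zip(a, residual)]
    return a


if __name__ == "__main__":
    R = [[2.0, 1.0], [1.0, 3.0]]
    p = [3.0, 5.0]
    # The exact solution of this small system is a = [0.8, 1.4].
    print(gradient_descent_solve(R, p, step=0.3, iterations=100))
```

Because every step is a matrix-vector product followed by a scaled addition, the rows can be processed in parallel, which is the property that makes this style of update attractive for hardware.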

Research Statement:

The last five years have seen a dramatic rise in consumer electronic products such as digital cameras, PDAs, laptops, cell phones and pocket PCs. The design of high performance architectures with low power consumption will have a tremendous impact on such embedded systems. My current research explores the impact of designing such high performance architectures for emerging wireless communication systems that can scale and adapt to future computing requirements. My research goal is to develop a design methodology to meet both real-time and power consumption requirements for embedded systems while simultaneously providing the flexibility and rapid prototyping capabilities of a programmable solution. My research will enable cell phones to run more sophisticated algorithms at higher data rates to support multimedia applications, digital cameras and camcorders to run more complex decoding algorithms for better resolution, and will contribute to new innovations in the design and use of future embedded systems. My scalable and programmable designs will also enable rapid evaluation of new algorithms and faster time-to-market for these embedded systems. This research builds on my past experience at Rice University in designing algorithms, VLSI architectures, and scalable, programmable architectures for baseband processing in wireless systems.

The rapid evolution of wireless systems has led to a plethora of standards for indoor and outdoor environments such as W-CDMA, 802.11a W-LAN and Bluetooth. New mobile devices need to support a variety of physical layers for each standard. Also, the need in emerging systems to provide better error rate performance by minimizing interference among users, using multiple antennas and providing higher spectral efficiency (in bits/sec/Hz) requires the use of highly sophisticated algorithms. Thus, it is extremely challenging to attain real-time performance for these wireless systems at high data rates (10-100 Mbps, depending on the standard). Traditional designs for wireless systems have typically been customized ASICs (Application-Specific Integrated Circuits) in order to meet high data rate requirements and to minimize power consumption. However, the need to rapidly prototype algorithms for ever-evolving standards, the long design time of ASICs and the flexibility needed to support multiple standards make programmable solutions increasingly important in emerging wireless systems.

Current programmable architectures fail to meet both the real-time and power consumption requirements for implementing proposed computationally-intensive wireless algorithms at these high data rates. Hence, heterogeneous solutions incorporating co-processors have been proposed for wireless systems. One example is the C6416 DSP (Digital Signal Processor) from Texas Instruments, which has both a Viterbi and a Turbo co-processor for decoding. In such heterogeneous systems, the role of the DSP is increasingly reduced to that of a co-processor controller. As more co-processors are added to support advanced algorithms in the future, the benefits of the programmable solution will be lost and the system will become as inflexible as a custom ASIC implementation. Current programmable architectures also lack the scalability and robustness needed to support more computationally-intensive algorithm designs for ever-evolving wireless standards while still meeting real-time and power consumption requirements.

I have investigated programmable architectures primarily for meeting real-time requirements in wireless systems as part of my Ph.D. dissertation. Power efficiency is provided in the design by (i) dynamically adapting the architecture to the appropriate parallelism level in the algorithms so that unused portions can be powered down and (ii) using fixed-point arithmetic units that can exploit sub-word parallelism. This design, while sufficient for real-time performance and for base-station implementations, will require radical modifications to meet the power consumption requirements of future mobile devices, which must support high data rates (in Mbps) at extremely low power. Meeting these requirements is one of my future research goals. I plan to use my expertise in VLSI and in scalable, programmable real-time architecture design to also meet power consumption requirements for future mobile devices. A power-efficient real-time architecture design will require deep insight into the VLSI layout of the architecture. My new power-efficient design, based on insights derived from the implemented algorithms and my current real-time architecture design, will place restrictions on the inter-cluster communication capabilities and limit the maximum register size and number of arithmetic units. My design will investigate new low power functional units such as application-specific units and truncated multipliers for the architecture. Accurate power and performance models for the programmable architectures will then be incorporated into an architecture simulator so that architecture scaling for different algorithms, real-time performance and power consumption can all be investigated simultaneously.

VLSI design is also important for ASIC implementations of algorithms proposed for emerging wireless systems, especially when they fail to meet real-time requirements on programmable implementations. ASIC designs provide the bounds on the best power and real-time performance that can be obtained for the algorithm implementation. I have gained considerable insight by looking at both programmable and ASIC solutions, enabling me to quantify the bottlenecks in programmable architectures with respect to ASIC implementations. This has led to research in providing architecture support to eliminate or minimize these bottlenecks and in bridging the gap between ASICs and programmable architectures in wireless system designs. VLSI design space exploration also includes investigating computer arithmetic techniques for area-time-power tradeoffs. Advances in computer arithmetic such as redundant number systems, fast and low power adders, multipliers, and truncated arithmetic circuits also significantly impact both ASICs and the VLSI layout of programmable architectures. My research contributions in ASIC design and computer arithmetic for wireless systems provide me with the expertise needed in this design phase.

Finally, it is important to choose the right algorithms for evaluation on programmable and ASIC solutions. Emerging wireless systems will employ sophisticated algorithms such as long spreading codes, space-time codes on multiple antennas, LDPC (Low Density Parity Check) codes and Turbo codes. The proposed algorithms need to be evaluated for performance factors such as bit error rate and spectral efficiency, as well as implementation factors such as computational complexity, fixed-point implementation suitability and parallelism. I have explored techniques from numerical linear algebra in order to exploit structure in the algorithms to enable lower complexity, finite precision and parallel implementations. This also simplifies scheduling of the algorithm on ASICs and programmable architectures and has a significant impact on architecture design.

In my research at Rice University these past five years, I have gained expertise in investigating joint algorithm-architecture designs for wireless systems. My research contributions in diverse areas of wireless communications, VLSI and programmable architectures have given me insight into designing efficient wireless algorithms and providing scalable, programmable, real-time and power-efficient architecture solutions for emerging high data rate wireless systems. My research has generated significant interest from the wireless industry in companies such as Nokia, Texas Instruments, National Instruments and in start-ups such as Sigprowireless and Chameleon systems. My graduate research has also been funded in part by Nokia and Texas Instruments. Collaborations with and feedback from Nokia and Texas Instruments have significantly influenced my research these past five years at Rice University. I plan on continuing my research interactions with industry in the future. My core strengths in high performance computing for embedded applications will also enable me to branch into related research areas and have synergistic research interactions with both the Electrical Engineering and Computer Science departments at your university.
