The high bandwidth density and low power consumption characteristics of silicon photonics devices can provide
a high performance interconnect solution for multiprocessor systems. At the same time, this technology also
poses a new set of constraints and challenges in architecting, designing, and integrating such systems.
The "macrochip" multiprocessor architecture leverages a photonically interconnected array of processor and/or
memory chips to provide a flexible platform to build heterogeneous systems. The design considerations for such
a system are influenced largely by the system architecture, the programming model and devices needed for
their implementation. This talk will first describe the macrochip platform, technology constraints and potential
interconnect solutions with the various device building blocks. Then it will present some topology choices that
range from a WDM point-to-point interconnect to more complex switched data channel networks. It will close
with a detailed analysis of these design choices and show the impact of the device constraints on performance
and power consumption along with some recent ultra-low power device implementation results.
Scaling of computing systems requires ultra-efficient interconnects with large bandwidth density. Silicon photonics
offers a disruptive solution with advantages in reach, energy efficiency and bandwidth density. We review our progress
in developing building blocks for ultra-efficient WDM silicon photonic links. Employing microsolder based hybrid
integration with low parasitics and high density, we optimize photonic devices on SOI platforms and VLSI circuits on
more advanced bulk CMOS technology nodes independently. We have progressively demonstrated single-channel
hybrid silicon photonic transceivers at 5 Gbps and 10 Gbps, and an 80 Gbps arrayed WDM silicon photonic
transceiver using reverse-biased depletion ring modulators and Ge waveguide photodetectors. Record
energy efficiencies of less than 100 fJ/bit and 385 fJ/bit were achieved for the hybrid integrated transmitter and receiver,
respectively. Waveguide grating based optical proximity couplers were developed with low loss and large optical
bandwidth to enable multi-layer intra/inter-chip optical interconnects. By thermally engineering WDM devices through
selective substrate removal, together with a WDM link using a synthetic wavelength comb, we significantly improved the
device tuning efficiency and reduced the tuning range. Using these techniques, a reduction in tuning power of two orders
of magnitude was achieved, and a tuning cost of only a few tens of fJ/bit is expected for high data rate WDM
silicon photonic links.
Electroabsorption from GeSi on silicon-on-insulator (SOI) is expected to have promising
potential for optical modulation due to its low power consumption, small footprint, and more
importantly, wide spectral bandwidth for wavelength division multiplexing (WDM) applications.
Germanium, as a bulk crystal, has a sharp absorption edge with a strong coefficient at the direct
band gap close to the C-band wavelength. Unfortunately, when integrated onto Silicon, or when
alloyed with dilute Si for blueshifting to the C-band operation, this strong Franz-Keldysh (FK)
effect in bulk Ge is expected to degrade. Here, we report experimental results for GeSi epi when
grown under a variety of conditions, such as different Si alloy content and selective versus non-selective
growth modes, on both Silicon and SOI substrates. We compare the measured FK effect
to the bulk Ge material.
Reduced pressure CVD growth of GeSi heteroepitaxy with various Si content was studied
by different characterization tools: X-ray diffraction (XRD), atomic force microscopy (AFM),
secondary ion mass spectrometry (SIMS), Hall measurement and optical transmission/absorption
to analyze performance for 1550 nm operation. State-of-the-art GeSi epi with low defect density
and low root-mean-square (RMS) roughness was fabricated into pin diodes and tested in a
surface-normal geometry. The diodes exhibit a low dark current density of 5 mA/cm² at 1 V reverse bias
with breakdown voltages of 45 V. Strong electroabsorption was observed in our GeSi alloy
with 0.6% Si content, with a maximum absorption contrast of Δα/α ~5 at 1580 nm and 75 kV/cm.
We present a hybrid integration technology platform for the compact integration of best-in-breed VLSI and photonic
circuits. This hybridization solution requires fabrication of ultralow parasitic chip-to-chip interconnects on the candidate
chips and assembly of these by a highly accurate flip-chip bonding process. The former is achieved by microsolder bump
interconnects that can be fabricated by wafer-scale processes, and are shown to have average resistance <1 ohm/bump
and capacitance <25fF/bump. This suite of technologies was successfully used to hybrid integrate high speed VLSI chips
built on the 90nm bulk CMOS technology node with silicon photonic modulators and detectors built on a 130nm
CMOS-photonic platform and an SOI-photonic platform; these particular hybrids yielded Tx and Rx components with
energies as low as 320fJ/bit and 690fJ/bit, respectively. We also report on challenges and ongoing efforts to fabricate
microsolder bump interconnects on next-generation 40nm VLSI CMOS chips.
Ring waveguide resonating structures with high quality factors are the key components servicing silicon
photonic links. We demonstrate highly efficient spectral tunability of the microphotonic ring structures
manufactured in commercial 130 nm SOI CMOS technology. Our rings are fitted with dedicated heaters
and integrated with silicon micro-machined features. Optimized layout and structure of the devices result in
their maximized thermal impedance and increased efficiency of the thermal tuning.
Silicon-based optical interconnects are expected to provide high bandwidth and low power consumption solutions for
chip-level communication applications, due to their electronics integration capability, proven manufacturing record and
attractive price volume curve. In order to compete with electrical interconnects, the energy requirement is projected to be
sub-pJ per bit for an optical link in chip to chip communication. Such low energies pose significant challenges for the
optical components used in these applications. In this paper, we review several low power photonic components
developed at Kotura for DARPA's Ultraperformance Nanophotonic Intrachip Communications (UNIC) project. These
components include high speed silicon microring modulators, wavelength (de)multiplexers using silicon cascaded
microrings, low power electro-optic silicon switches, low loss silicon routing waveguides, and low capacitance
germanium photodetectors. Our microring modulators demonstrate an energy consumption of ~ 10 fJ per bit with a drive
voltage of 1 V. Silicon routing waveguides have a propagation loss of < 0.3 dB/cm, enabling a propagation length of a
few tens of centimeters. The germanium photodetectors can have a low device capacitance of a few fF, a high
responsivity up to 1.1 A/W and a high speed of >30 GHz. These components are potentially sufficient to construct a full
optical link with an energy consumption of less than 1 pJ per bit.
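The waveguide loss figure above can be turned into a simple link-budget estimate. The following is a minimal sketch, assuming the quoted upper bound of 0.3 dB/cm applies uniformly along the waveguide; the function names and the example lengths are illustrative and do not come from the abstract.

```python
def propagation_loss_db(loss_db_per_cm: float, length_cm: float) -> float:
    """Total propagation loss in dB for a uniform waveguide."""
    return loss_db_per_cm * length_cm


def transmitted_fraction(loss_db: float) -> float:
    """Fraction of optical power remaining after a given loss in dB."""
    return 10 ** (-loss_db / 10)


LOSS_DB_PER_CM = 0.3  # upper bound quoted in the abstract

# Illustrative lengths only ("a few tens of centimeters"); not from the source.
for length_cm in (10, 20, 30):
    loss = propagation_loss_db(LOSS_DB_PER_CM, length_cm)
    print(f"{length_cm} cm: {loss:.1f} dB loss, "
          f"{transmitted_fraction(loss):.1%} of power transmitted")
```

Under this assumption, loss grows linearly in dB with length, which is why a sub-0.3 dB/cm figure is what makes routing over tens of centimeters plausible.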
Silicon photonics is envisioned as a promising solution to address the interconnect bottleneck
in large-scale multi-processor computing systems, owing to advantageous attributes such as wide
bandwidth, high density, and low latency. To leverage these advantages, the optical proximity coupler is
one of the critical enablers. Chip-to-chip, layer-to-layer optical proximity couplers with low loss,
large bandwidth, small footprint and integration compatibility are highly desirable. In this paper, we
demonstrate chip-to-chip optical proximity coupling using grating couplers. We report the
experimental results using grating couplers fabricated in a photonically-enabled commercial 130nm
SOI CMOS process.
Scaling of high performance, many-core, computing systems calls for disruptive solutions to provide ultra energy
efficient and high bandwidth density interconnects at very low cost. Silicon photonics is viewed as a promising solution.
For silicon photonics to prevail and penetrate deeper into the computing system interconnection hierarchy, it requires
innovative optical devices, novel circuits, and advanced integration. We review our recent progress in key building
blocks toward sub-pJ/bit optical links for inter/intra-chip applications: ultra-low power silicon photonic transceivers. In
particular, a compact reverse-biased silicon ring modulator was developed with high modulation bandwidth sufficient for
15Gbps modulation, very small junction capacitance of ~50fF, low voltage swing of 2V, high extinction ratio (>7dB)
and low optical loss (~2dB at on-state). Integrated with low power CMOS driver circuits using a low parasitic microsolder
bump technique, it achieved a record low power consumption of 320fJ/bit at 5Gbps data rate. Stable operation with a bit-error-rate
better than 10⁻¹³ was accomplished with simple thermal management. We further review the first hybrid
integrated silicon photonic receiver based on a Ge waveguide photodetector using the same integration technique, with
which a high energy efficiency of 690fJ/bit and a sensitivity of -18.9dBm at 5Gbps data rate for a bit-error-rate of 10⁻¹² were
achieved.
Ring waveguide resonating structures with high quality factors are the key components in the silicon photonics portfolio
boosting up its functionality and circuit performance. Due to a number of manufacturing reasons their peak wavelengths
are often prone to deviate from designed values. In order to keep the ring resonator operating as specified, its peak
wavelength then needs to be corrected in a reliable and power efficient way. We demonstrate the performance of the
thermally tunable mux/demux filter ring structures fabricated in the commercial 130 nm SOI CMOS line.
In this paper we present a computing system that uniquely leverages the bandwidth, density, and
latency advantages of silicon photonic interconnects to enable highly compact supercomputer-scale
systems. We present the details of an optically enabled "macrochip", which is a set of
contiguous, optically-interconnected chips that deploy wavelength-division multiplexing (WDM)
enabled by silicon photonics. We describe the system architecture and the WDM point-to-point
network implementation of a "macrochip" providing bisection bandwidth of 10 TBps and discuss
system and device level challenges, constraints, and the critical technologies needed to implement
this system. We present a roadmap to lowering the energy-per-bit of a silicon photonic
interconnect and highlight recent advances in silicon photonics under the UNIC program that
facilitate implementation of a "macrochip" system made of arrayed chips.
We report a very compact (1.6 μm × 10 μm) and low dark current (20nA) Germanium p-i-n photodetector integrated on
0.25μm thick silicon-on-insulator (SOI) waveguides. A thin layer of Germanium was selectively epitaxially grown on top
of the SOI waveguides. Light is evanescently coupled into the Germanium layer from the bottom SOI waveguide. The device
demonstrates responsivities of 0.9A/W and 0.56A/W at wavelengths of 1300nm
and 1550nm, respectively, and a dark current of less than 20nA at -0.5V bias. The 3dB bandwidth of the device is measured
to be 23GHz at -0.5V bias.
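The quoted responsivities can be related to external quantum efficiency through the standard relation η = R·h·c/(q·λ). The sketch below applies that textbook formula to the two reported operating points; the function name is illustrative, and the resulting efficiencies are derived here rather than reported in the abstract.

```python
PLANCK_J_S = 6.62607015e-34           # Planck constant (J*s)
LIGHT_SPEED_M_S = 299_792_458.0       # speed of light in vacuum (m/s)
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge (C)


def external_quantum_efficiency(responsivity_a_per_w: float,
                                wavelength_m: float) -> float:
    """Electrons collected per incident photon, from responsivity in A/W."""
    photon_energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return responsivity_a_per_w * photon_energy_j / ELEMENTARY_CHARGE_C


# Responsivity values and wavelengths as quoted in the abstract.
for responsivity, wavelength_nm in ((0.9, 1300), (0.56, 1550)):
    eta = external_quantum_efficiency(responsivity, wavelength_nm * 1e-9)
    print(f"{wavelength_nm} nm: R = {responsivity} A/W, efficiency = {eta:.1%}")
```

Because photon energy falls with wavelength, the same responsivity corresponds to a lower quantum efficiency at 1550nm than at 1300nm.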
The Ultra-performance Nanophotonic Intrachip Communication (UNIC) project aims to achieve unprecedented high-density,
low-power, large-bandwidth, and low-latency optical interconnect for highly compact supercomputer systems.
This project, which started in 2008, sets extremely aggressive goals for the power consumption and footprint of
optical devices and the integrated VLSI circuits. In this paper we will discuss our challenges and present some of our
first-year achievements, including a 320 fJ/bit hybrid-bonded optical transmitter and a 690 fJ/bit hybrid-bonded optical
receiver. The optical transmitter was made of a Si microring modulator flip-chip bonded to a 90nm CMOS driver with
digital clocking. With only 1.6mW power consumption measured from the power supply voltages and currents, the
transmitter exhibits a wide open eye with extinction ratio >7dB at 5Gb/s. The receiver was made of a Ge waveguide
detector flip-chip bonded to a 90nm CMOS digitally clocked receiver circuit. With 3.45mW power consumption, the
integrated receiver demonstrated -18.9dBm sensitivity at 5Gb/s for a BER of 10⁻¹². In addition, we will discuss our
Mux/Demux strategy and present our devices with small footprints and low tuning energy.
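The energy-per-bit figures above follow directly from the measured power and the data rate, since energy per bit is power divided by bit rate. A minimal sketch of that arithmetic, using the values quoted in the abstract (the function name is illustrative):

```python
def energy_per_bit_fj(power_w: float, bit_rate_bps: float) -> float:
    """Energy per bit in femtojoules: power (W) divided by bit rate (b/s)."""
    return power_w / bit_rate_bps * 1e15


BIT_RATE_BPS = 5e9  # 5Gb/s, as quoted

tx_fj = energy_per_bit_fj(1.6e-3, BIT_RATE_BPS)   # transmitter: 1.6mW
rx_fj = energy_per_bit_fj(3.45e-3, BIT_RATE_BPS)  # receiver: 3.45mW

print(f"Transmitter: {tx_fj:.0f} fJ/bit")
print(f"Receiver:    {rx_fj:.0f} fJ/bit")
```

The results reproduce the 320 fJ/bit and 690 fJ/bit figures stated for the hybrid-bonded transmitter and receiver.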
We review the progress and challenges in scaling computing systems; discuss the
potential benefits and challenges for achieving optical-interconnects to the chip via the
native integration of silicon photonics components with VLSI electronics; and introduce
the "macrochip" - a collection of contiguous silicon chips enabled by optical proximity
communication.
We introduce a novel approach to interconnect multiple chips together with a silicon
photonic WDM point-to-point network enabled by optical proximity communications to act as a
single large piece of logical silicon much larger than a single reticle limit. We call this structure a
macrochip. This non-blocking network provides all-to-all low-latency connectivity while
maximizing bisection bandwidth, making it ideal for multi-core and multi-processor
interconnections. We envision bisection bandwidth up to TBps for an 8x8 macrochip design. A
5-6x improvement in latency can be achieved when compared to a purely electronic implementation.
We also observe better overall performance over other optical network architectures.
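To give a sense of scale for an all-to-all point-to-point network on an 8x8 array, the sketch below counts directed site-to-site channels. It assumes exactly one dedicated channel per ordered pair of sites and an even split of the array into two halves; the abstract does not state how channels map onto wavelengths or waveguides, so these counts are illustrative only.

```python
def point_to_point_channels(rows: int, cols: int) -> int:
    """Directed channels when every site has a dedicated link to every other site."""
    sites = rows * cols
    return sites * (sites - 1)


def bisection_channels(rows: int, cols: int) -> int:
    """Directed channels crossing a cut that splits the sites into two halves."""
    sites = rows * cols
    half = sites // 2
    # Each site on one side reaches each site on the other, in both directions.
    return 2 * half * (sites - half)


ROWS, COLS = 8, 8  # the 8x8 macrochip design mentioned in the abstract

print("sites:", ROWS * COLS)
print("directed channels:", point_to_point_channels(ROWS, COLS))
print("channels crossing the bisection:", bisection_channels(ROWS, COLS))
```

Under these assumptions roughly half of all channels cross the bisection, which is consistent with the claim that a point-to-point topology maximizes bisection bandwidth.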
We review 10Gb/sec Optical Proximity Communication realized with packaged chips that carry SOI
optical waveguides and reflecting mirrors micromachined in silicon. The high precision chip to chip
alignment and placement was enabled by a new packaging concept based on the co-integration of
pyramidal pit features defined by anisotropic silicon etch and matching high precision micro-spheres. We
support this novel packaging approach with measured optical transmission data and discuss its extension
to other applications of Proximity Communication.
We evaluate VCSEL interconnects for next-generation High Productivity Computers in which hundreds of terabits of bandwidth are envisioned. We present results for VCSEL-based links operating with PAM-4 signaling using a commercial 0.13μm CMOS technology. We perform a complete link analysis of the Bit Error Rate, Q factor, and random and deterministic jitter by measuring waterfall curves versus margins in time and amplitude. We demonstrate that VCSEL-based PAM-4 can match or even improve performance over binary signaling under the conditions of a bandwidth-limited 100 meter multi-mode optical link at 5Gbps. We present the first sensitivity measurements for optical PAM-4 and compare them with binary signaling. An empirical relationship for VCSEL scaling versus bit rate and aperture is presented in order to explore the reliability of VCSEL-based links. Reliability is found to degrade with aperture with a fourth order power law dependence.
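PAM-4 carries two bits per symbol on four amplitude levels, so the symbol rate is half the bit rate, which is what relaxes the bandwidth requirement on a bandwidth-limited link. The sketch below illustrates this with a Gray-coded bit-to-level mapping; the mapping and the function names are assumptions made for illustration, since the abstract does not specify the coding used.

```python
# Gray-coded mapping from bit pairs to the four amplitude levels (assumed).
GRAY_PAM4_LEVELS = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def bits_to_pam4(bits: list[int]) -> list[int]:
    """Group bits in pairs and map each pair to one of four levels."""
    if len(bits) % 2 != 0:
        raise ValueError("PAM-4 needs an even number of bits")
    pairs = zip(bits[0::2], bits[1::2])
    return [GRAY_PAM4_LEVELS[pair] for pair in pairs]


def symbol_rate_baud(bit_rate_bps: float, bits_per_symbol: int = 2) -> float:
    """Symbol rate for a given bit rate; PAM-4 carries 2 bits per symbol."""
    return bit_rate_bps / bits_per_symbol


symbols = bits_to_pam4([0, 0, 0, 1, 1, 1, 1, 0])
print("levels:", symbols)
print("symbol rate at 5Gbps:", symbol_rate_baud(5e9) / 1e9, "Gbaud")
```

With Gray coding, adjacent levels differ in a single bit, so the most likely symbol errors (to a neighbouring level) cost one bit error rather than two.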
We discuss the technical rationale, challenges, and potential for achieving the intimate integration of photonics components such as lasers, detectors, and modulators with VLSI electronics and review the progress made towards commercializing this technology for high-density optical transceivers and switching products.
Parallel optical interconnects based on vertical-cavity surface-emitting lasers are being widely deployed today in switching and routing systems. Almost all of the major OEMs today are using parallel optics in their flagship products to solve interconnect problems where they are most relevant: at the bay-to-bay, shelf-to-shelf, and card-to-card levels. The density and capacity of these systems are already being constrained by the capacity of these one-dimensional links. We expect that next generations of systems will need high-density 2D parallel links to further improve the performance-cost metric. Ultimately, we believe this metric can be optimized by directly integrating interconnect together with data processing and switching circuitry.
This paper describes the design and the electrical and optical test results for a 500Mb/s, 32-channel VCSEL driver IC with built-in self-test and clock generation circuitry. The circuit design and silicon parts are available to the research community through the Consortium for Optical and Optoelectronic Technologies in Computing and the Optoelectronics Industry Association. This device is specifically targeted at users building VCSEL-based smart photonic system demonstrators.
Hybrid integration of optoelectronic devices, such as GaAs MQW modulators, to CMOS VLSI circuits provides the opportunity to design ICs that integrate millions of transistors and thousands of high-speed optical I/Os for high-performance computing and switching applications. One of the challenges in designing such large-scale ICs lies in the development of an efficient method for integrating existing VLSI circuit layouts with 2D arrays of optoelectronic devices. This paper presents several such methods and describes their application.
Novel diffractive optical elements (DOE) with multifunctionality in polarization or color are reviewed. We review three technological approaches for construction of such DOEs with multifunctionality in polarization: the two-substrate birefringent computer generated hologram (BCGH), the multiple order delay BCGH, and the form birefringent computer generated hologram approaches. We also discuss the accurate design of such DOEs enabled by our modeling tools based on rigorous coupled wave analysis. Microfabrication techniques developed for realization of these three types of polarization selective DOEs are described. The developed DOEs with multifunctionality in polarization or color are used to package a 3D optoelectronic VLSI chip, a transparent optical multistage interconnection network, and a wavelength division demultiplexer, providing mechanical and thermal stability, light efficiency, reduced volume, weight, and cost, and increased reliability.
We review recent results from the hybrid integration of silicon CMOS VLSI reticles with GaAs/AlGaAs Multiple Quantum Well modulators that have been made in conjunction with multi-project hybrid CMOS/Modulator foundry services.
We describe the parallel optical test of a photonic first-in-first-out (PFIFO) buffer memory based on flip-chip CMOS/SEED optoelectronic technology. The PFIFO detects pages of 4 by 8 binary bits, stores them in a 32 bit deep buffer memory, and transmits them in a 16 by 2 output modulator array. All 32 I/O and memory channels functioned, with an average input power of 40 microwatts and a minimum output contrast ratio of 2:1.
We present the Stretch switch; a new class of self-routing multistage interconnection networks that provides a continuous performance-cost tradeoff between two of its degenerate forms: the Knockout switch and the Tandem banyan network. Stretch networks utilize simple destination tag routing and can be designed to achieve low delay and arbitrarily low blocking probabilities for random, permutation, and non-uniform traffic without using internal buffers in the switches. These qualities make them ideally suited for both fast packet (ATM) switching and multiprocessor architectures, and facilitate efficient VLSI and photonic implementations.
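Destination tag routing, which the abstract cites as the routing scheme, can be illustrated on a generic shuffle-exchange (omega) network: at each stage the packet passes through a perfect shuffle, and the 2x2 switch picks its upper or lower output from one bit of the destination address. The sketch below is not the Stretch switch itself; the omega topology and the function name are stand-ins chosen for illustration.

```python
def route_omega(source: int, destination: int, n_stages: int) -> list[int]:
    """Trace a packet through an omega network using destination tag routing.

    Returns the line position after each stage, starting with the source.
    The network has 2**n_stages inputs and outputs.
    """
    size = 1 << n_stages
    if not (0 <= source < size and 0 <= destination < size):
        raise ValueError("source and destination must be valid port numbers")

    position = source
    path = [position]
    for stage in range(n_stages):
        # Perfect shuffle: rotate the n-bit position left by one bit.
        position = ((position << 1) | (position >> (n_stages - 1))) & (size - 1)
        # Destination tag: use the address bits from most to least significant.
        tag_bit = (destination >> (n_stages - 1 - stage)) & 1
        # The 2x2 switch sends the packet to its upper (0) or lower (1) output.
        position = (position & ~1) | tag_bit
        path.append(position)
    return path


path = route_omega(source=5, destination=2, n_stages=3)
print("path through the network:", path)
print("arrived at destination:", path[-1] == 2)
```

Each switch needs only one bit of the address and no global state, which is what makes the scheme self-routing. Two packets can still request the same switch output in the same stage; handling that contention is the part that distinguishes designs such as the Knockout and Tandem banyan networks mentioned above, and it is not modelled here.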
Image processing is widely regarded as a successful application of electronic technology. However, there is a wide gap between commercially successful image processing implementations and the system designs that have been proposed by the research community. Of particular interest are image processing architectures that use large numbers of densely interconnected processors to compute and transmit image data in a parallel 2-D format. The interconnect density and packaging constraints of electronic packaging technology preclude efficient implementation of such architectures. In contrast, optoelectronic integrated circuits combined with surface-normal optical interconnects offer the promise of a technology platform that is inherently better suited for the implementation of interconnect-intensive image processing applications.
There is a growing need in the telecommunication industry for a scalable switch that can provide high-throughput communication between a large number of I/O ports: a terabit switch. Recent advances in the area of fiber amplifiers have spurred interest in 'transparent' optical networks, wherein communication between users is achieved without multiple conversions between the optical and electrical domains. Moreover, polarization compensators have been developed for single-mode fibers to allow automatic and stable control of the polarization state of output optical signals. This may enable a polarization-independent switching system that uses polarization-dependent 'all-optical' switches. Polarization switching has been widely proposed in the context of free-space optical multistage interconnection networks for switching applications as well as for multiprocessor interconnections. In this paper we suggest a novel polarization-controlled free-space optical switch and present the implementation and characterization of a 4 x 4 photonic switch. The switching system is based on a unique optical element capable of acting with an arbitrary independent phase function upon illumination with horizontally or vertically polarized monochromatic light. This element, known as a birefringent computer generated hologram (BCGH), is composed of two birefringent substrates, etched with a surface relief pattern and joined face to face. BCGH optical interconnects provide arbitrary independent, efficient responses to the two orthogonal linear polarizations, thereby reducing the number of optical components in the free-space optical system.
The latest experimental results of the motionless head parallel readout system for optical disks are presented. The system is designed to read data blocks encoded as 1D Fourier holograms distributed radially on the disk active surface. Such systems offer several advantages: high data rates, low retrieval times and simple optical implementation.
This paper describes a scalable, highly connected, 3D optoelectronic neural system that uses free-space optical interconnects with silicon-VLSI based hybrid optoelectronic circuits. The system design uses an efficient combination of pulse-width modulating optoelectronic neurons and pulse-amplitude modulating electronic synapses. A prototype system is built and applied to a simple classification problem. An optoelectronic testbench for evaluating learning algorithms suitable for the optoelectronic architecture is implemented. Future directions for the optoelectronic architecture are also discussed; these include limited interconnect neural systems and parallel weight loading that allow receptive fields of arbitrary sizes and connection multiplexing to be achieved.
Multistage interconnection networks based on the perfect shuffle topology are often suggested as candidates for large scale multiprocessor and broadband communication networks. The perfect shuffle interconnection requires global communication links that extend across the entire system and have a large number of wire crossovers. These constraints prohibit a scalable electronic implementation both within a VLSI chip and at the MCM or board levels. This paper presents the architecture of a scalable optoelectronic hardware module for building multistage interconnection networks. To achieve a scalable implementation, the design uses free-space optical interconnects for global communication links and electronic VLSI technology for local communication links and switching elements (e.g., smart pixel approach). Our approach is to engineer a network with the desired functionality, cost, and performance characteristics using generic hardware modules. In this paper, various applications are examined and their implementation using the proposed method is described.
KEYWORDS: Neurons, Optoelectronics, Modulators, Neural networks, Sensors, Prototyping, Optical interconnects, Analog electronics, Free space optics, Very large scale integration
We report the implementation of a prototype 3-D optoelectronic neural system that combines free-space optical interconnects with silicon-VLSI-based hybrid optoelectronic circuits. The prototype system consists of a 16-pixel input, 4-neuron hidden and a single-neuron output layer, where the denser input-to-hidden layer connections are optical. The input layer uses PLZT light modulators to generate optical outputs which are distributed to an optoelectronic analog neural network chip through space invariant holographic optical interconnects. Optical interconnections provide fan-out with negligible delay and allow the use of compact, purely on-chip electronic H-tree fan-in structures. The scalable prototype system achieves 8-bit electronic fan-in precision and a maximum speed of 640 million interconnections per second. The system was tested using synaptic weights learned off-system and applied to a simple line recognition task.
A motionless head 2-D parallel readout system for optical disks is presented. The system is designed to read data blocks encoded as 1-D Fourier holograms distributed radially on the disk active surface. Such systems offer several advantages: high data rates, low retrieval times, and simple implementation. It is used as the secondary storage of a high performance optoelectronic associative memory system.
The Programmable Opto-Electronic Multiprocessor (POEM) combines free-space optical interconnects, optoelectronic devices, and electronic processors to perform computations. This paper investigates a specific POEM architecture for a multistage interconnection network application. For the chosen system, there is an optimum combination of optics and electronics. The effect of varying optoelectronic device parameters on the system performance is also examined.
High data rates, low retrieval times, and simple implementation are presently shown to be obtainable by means of a motionless-head 2D parallel-readout system for optical disks. Since the optical disk obviates mechanical head motions for access, focusing, and tracking, addressing is performed exclusively through the disk's rotation. Attention is given to a high-performance associative memory system configuration which employs a parallel readout disk.