In short

Four major SDKs dominate quantum programming in 2026. Qiskit (IBM, open source, Python) is the most widely used and has the tightest integration with IBM's cloud hardware — if you want to run on a Heron chip, Qiskit is the default. Cirq (Google, open source, Python) is engineered around Google's Sycamore and Willow devices and around noisy-simulation workflows. PennyLane (Xanadu, open source, Python) is built for variational quantum computing and quantum machine learning — its killer feature is differentiable quantum circuits that play directly with PyTorch, TensorFlow, and JAX. CUDA-Q (NVIDIA, open source, C++ and Python) is the newest of the four: a heterogeneous GPU-plus-QPU SDK that treats quantum devices as accelerators inside a classical program, with state-of-the-art GPU simulators for circuits up to ~40 qubits. Pick by your hardware target (IBM → Qiskit, Google → Cirq, Xanadu photonic → PennyLane), by your workload (VQE/QML → PennyLane or CUDA-Q, general quantum research → Qiskit), or by your compute environment (dense GPU cluster → CUDA-Q). All four speak the same circuit-model language underneath — once you know one, learning a second takes days, not months.

You have read the textbook. You have watched the videos. You have worked through exercises with pencil and paper. And now you want to actually run a circuit — on hardware, or on a simulator that behaves like hardware. You open Python, type import qiskit, and something real executes. This is the moment quantum computing stops being abstract.

The question is: which SDK. There is not one. There are four serious ones, plus a handful of niche contenders, and the choice is not obvious because each one solves a different problem well. Qiskit has the best hardware story. Cirq has the cleanest circuit-construction API. PennyLane has automatic differentiation. CUDA-Q has GPU-accelerated simulators and a heterogeneous compute model. If you try to learn all four in your first month you will learn none; if you try to learn the wrong one for your project you will waste three months discovering its limits.

This chapter maps the four SDKs onto the decisions you actually have to make. You will see what each SDK looks like in code (a three-qubit GHZ state in all four, side by side), which hardware each one targets, how installation works on a typical Indian laptop or college cluster, and which problems suit which tool. By the end you will know which one to pip install first.

The four SDKs — shape and purpose

Before the comparison, a sketch of each.

Qiskit (originally "Quantum Information Science Kit"). Started by IBM Research in 2017, open-sourced under the Apache 2.0 licence, Python-first. Version 1.0 arrived in early 2024 and version 2.0 in 2025; the 2.x line is the stable current release. Qiskit has a larger contributor base than any of the others — over 600 contributors, a full-time IBM team, and a tight coupling to the IBM Quantum cloud. It is the most-taught SDK in university courses and the most cited in arXiv papers.

Cirq. Google started Cirq in 2018 as a Python framework designed around the constraints of their Sycamore and (now) Willow chips: hardware-efficient gates, explicit qubit connectivity, and native support for parameterised circuits. Open source under Apache 2.0. Smaller contributor community than Qiskit, but the tightest integration with Google's hardware and the cleanest circuit API.

PennyLane. Xanadu, a Canadian photonic-quantum-computing startup with significant Indian research ties, released PennyLane in 2018 as the first SDK designed for differentiable quantum programming. Its unique feature: any quantum circuit can be differentiated with respect to its parameters using the same automatic-differentiation machinery that powers neural networks. PennyLane integrates directly with PyTorch, TensorFlow, and JAX, so you can put a variational quantum circuit inside a classical neural network and train the whole thing end to end.

CUDA-Q. NVIDIA released CUDA-Q (originally "CUDA Quantum") in 2023. It is a heterogeneous-computing framework, available in both C++ and Python, that treats a quantum processor as one kind of accelerator alongside a GPU. CUDA-Q's distinguishing feature is state-of-the-art GPU-accelerated simulators — using tensor-network and state-vector methods that scale to ~40 qubits on a single A100 and more on multi-GPU clusters — together with a compiler stack that emits hardware instructions for multiple QPU vendors.

Figure: The four SDKs — hardware target and workflow focus. A 2×2 grid, one box per SDK: Qiskit (IBM; Apache 2.0; IBM Heron cloud hardware; primitive-based execution with Sampler and Estimator; largest ecosystem; Python), Cirq (Google; Sycamore and Willow; explicit qubit topology and noisy simulation; cleanest circuit API; Python), PennyLane (Xanadu; photonic hardware plus plugins; differentiable circuits, VQE and QML; PyTorch, TensorFlow, and JAX integration; Python), CUDA-Q (NVIDIA; multi-vendor QPUs plus GPU simulators to ~40 qubits; heterogeneous kernel model; C++ and Python). Pick by hardware target and workload — all four share the same underlying circuit model.
The four SDKs differ in hardware target (IBM, Google, Xanadu plus plugins, multi-vendor plus GPU), in workflow (primitive-based execution, explicit circuit construction, differentiable circuits, heterogeneous kernels), and in ecosystem size (Qiskit largest, CUDA-Q newest and growing fastest). Underneath, they all express the same mathematical circuit model — porting between them is a notational exercise.

Why this 2×2 framing is useful: the hardware axis (IBM, Google, photonic, multi-vendor/GPU) tells you which QPU your circuits will end up running on. The workflow axis (general-purpose, topology-aware, autodiff, heterogeneous kernels) tells you which programming style the SDK optimises for. Picking the wrong cell costs months.

Qiskit — the default choice

Qiskit is the largest quantum SDK by installed base, by contributor count, and by university-course adoption. Since version 1.0 (2024), and continuing through 2.0 (2025), the API has been organised around two "primitives" — the Sampler and the Estimator — which together cover almost every quantum-computing workflow at a higher level than raw circuit-execution code.

What Qiskit gets right

What Qiskit gets wrong (or less right)

Installing Qiskit

pip install qiskit qiskit-ibm-runtime

For chemistry: pip install qiskit-nature pyscf. For ML: pip install qiskit-machine-learning. The core package is ~40 MB; the full ecosystem under 200 MB. Works on Windows, macOS, and Linux. Works on any recent Python (3.9+). No GPU required.

On a typical Indian college machine (4–8 GB RAM, Python 3.10, no GPU), Qiskit runs fine. Simulator circuits up to ~25 qubits are tractable; beyond that you start swapping to disk.
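The ~25-qubit ceiling follows directly from state-vector memory: n qubits need 2^n complex amplitudes, each 16 bytes as complex128. A quick back-of-envelope check, pure Python with no SDK required (the function name statevector_bytes is ours, for illustration):

```python
# Memory for a dense state-vector simulation:
# n qubits -> 2**n amplitudes, each a complex128 (16 bytes).
def statevector_bytes(n_qubits: int) -> int:
    return (2 ** n_qubits) * 16

for n in (20, 25, 28, 30, 40):
    gib = statevector_bytes(n) / 2**30
    print(f"{n} qubits -> {gib:,.1f} GiB")
# 25 qubits -> 0.5 GiB, 28 -> 4 GiB, 30 -> 16 GiB, 40 -> 16,384 GiB
```

The jump from 0.5 GiB at 25 qubits to 16 GiB at 30 is why a 4–8 GB laptop starts swapping just past 25, and why 40-qubit dense simulation is GPU-cluster territory.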

Cirq — Google's native SDK

Cirq was designed by Google's Quantum AI team to express circuits the way Google's hardware actually executes them: explicit qubits placed on a specific chip topology, native gate sets, exposed hardware constraints. The aesthetic is closer to "electrical engineering" than to "mathematical circuits."

What Cirq gets right

What Cirq gets wrong (or less right)

Installing Cirq

pip install cirq

~30 MB. Same cross-platform, same Python version requirements as Qiskit. No GPU requirement.

PennyLane — the variational-first SDK

Xanadu's PennyLane took a different design choice from Qiskit and Cirq: rather than optimising for hardware execution, PennyLane optimised for differentiable quantum programming. A PennyLane quantum function behaves like a black box that maps parameters (usually a NumPy array, PyTorch tensor, TensorFlow variable, or JAX array) to an expected value, and the gradient of that value with respect to the parameters is computed automatically — using the parameter-shift rule for real hardware and automatic differentiation for simulators.

This makes PennyLane the right tool whenever your workflow contains a classical optimisation loop over a quantum circuit: VQE, QAOA, variational quantum classifiers, quantum neural networks, QGANs, barren-plateau studies, hybrid ML models. If that describes your project, start here.

What PennyLane gets right

What PennyLane gets wrong (or less right)

Installing PennyLane

pip install pennylane

For PyTorch integration: pip install pennylane torch. For GPU simulation: pip install pennylane-lightning-gpu (requires CUDA). For IBM hardware: pip install pennylane-qiskit.

CUDA-Q — NVIDIA's heterogeneous SDK

CUDA-Q is the newest of the four major SDKs, released in 2023 and still evolving rapidly. Its core premise is that quantum computers will not replace classical computers — they will sit inside classical workflows as accelerators, the way GPUs do today. A CUDA-Q program runs on a CPU, calls into GPU kernels and QPU kernels transparently, and orchestrates the hybrid computation.

CUDA-Q has two characteristic strengths: GPU-accelerated simulators that push single-node simulation to ~40 qubits of dense state vector or several hundred qubits of tensor-network states, and multi-QPU vendor support through its compiler back-ends — the same kernel can be compiled for IonQ hardware, Quantinuum hardware, IBM hardware (experimentally), or simulator.

What CUDA-Q gets right

What CUDA-Q gets wrong (or less right)

Installing CUDA-Q

pip install cudaq

Linux is the first-class platform; macOS and Windows are reached indirectly (Docker containers and WSL, respectively). GPU simulators require an NVIDIA GPU with CUDA 11.8+. On a GPU-less machine CUDA-Q still runs (CPU simulators are available), but you are not using the SDK for its real purpose.

On a typical Indian college GPU cluster (often NVIDIA Tesla T4 or RTX 3080-era cards), CUDA-Q simulators handle 25–30 qubits comfortably. The GPU-QPU Stack chapter has a deeper account of the hardware story.

Example: three-qubit GHZ state in all four SDKs

A GHZ state — named for Greenberger, Horne, and Zeilinger — is the canonical three-qubit entangled state:

|\text{GHZ}\rangle = \frac{1}{\sqrt{2}}(|000\rangle + |111\rangle)

Measuring all three qubits in the computational basis gives 000 half the time and 111 half the time, with essentially zero probability for the six other outcomes (001 through 110). It is produced by a Hadamard on qubit 0 followed by two CNOTs — a three-gate circuit, short enough that each SDK's flavour shows through.
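That H-then-two-CNOTs claim can be checked without any SDK at all. A minimal pure-Python state-vector sketch (the helpers apply_h and apply_cnot are ours, not from any of the four SDKs; qubit 0 is the most significant bit of the basis-state index, so index 0 is |000⟩ and index 7 is |111⟩):

```python
import math

def apply_h(state, q, n):
    """Hadamard on qubit q of an n-qubit state vector."""
    s = 1 / math.sqrt(2)
    bit = 1 << (n - 1 - q)
    new = [0.0] * len(state)
    for i, amp in enumerate(state):
        if i & bit:                       # qubit q is |1>: -> (|0> - |1>)/sqrt(2)
            new[i ^ bit] += s * amp
            new[i] -= s * amp
        else:                             # qubit q is |0>: -> (|0> + |1>)/sqrt(2)
            new[i] += s * amp
            new[i ^ bit] += s * amp
    return new

def apply_cnot(state, ctrl, tgt, n):
    """CNOT: flip qubit tgt wherever qubit ctrl is |1>."""
    cbit, tbit = 1 << (n - 1 - ctrl), 1 << (n - 1 - tgt)
    new = list(state)
    for i in range(len(state)):
        if (i & cbit) and not (i & tbit):  # visit each swapped pair once
            new[i | tbit], new[i] = state[i], state[i | tbit]
    return new

n = 3
state = [0.0] * (1 << n)
state[0] = 1.0                         # start in |000>
state = apply_h(state, 0, n)           # H on qubit 0
state = apply_cnot(state, 0, 1, n)     # CNOT 0 -> 1
state = apply_cnot(state, 1, 2, n)     # CNOT 1 -> 2

print(state[0], state[7])  # both 1/sqrt(2) ~ 0.7071; every other amplitude is zero
```

Twenty lines of bit-twiddling reproduce exactly what all four SDK snippets below compute — which is the point: the SDKs differ in ergonomics and backends, not in mathematics.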

Example 1: GHZ state written in all four SDKs

Let's write the same circuit four ways.

Qiskit.

from qiskit import QuantumCircuit
from qiskit.primitives import StatevectorSampler

qc = QuantumCircuit(3, 3)
qc.h(0)
qc.cx(0, 1)
qc.cx(1, 2)
qc.measure([0, 1, 2], [0, 1, 2])

sampler = StatevectorSampler()
result = sampler.run([qc], shots=1000).result()
counts = result[0].data.c.get_counts()
print(counts)  # ~{'000': 500, '111': 500}

Why this reads naturally: Qiskit mirrors the textbook — you build a circuit step by step, measure, then execute via a primitive (here a statevector sampler for simulation). The API is verbose but each piece is explicit.

Cirq.

import cirq

q0, q1, q2 = cirq.LineQubit.range(3)
circuit = cirq.Circuit(
    cirq.H(q0),
    cirq.CNOT(q0, q1),
    cirq.CNOT(q1, q2),
    cirq.measure(q0, q1, q2, key='result'),
)

sim = cirq.Simulator()
result = sim.run(circuit, repetitions=1000)
print(result.histogram(key='result'))  # ~{0: 500, 7: 500}

Why Cirq looks different: Cirq puts qubits first — you declare the qubits explicitly, then build the circuit as a sequence of operations on them. The histogram returns integers (0 = 000, 7 = 111) rather than bit strings, reflecting Cirq's engineering-leaning conventions.
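Converting between Cirq's integer keys and the bitstring convention the other three SDKs use is a one-liner each way (pure Python; the function names are ours, and the first measured qubit maps to the leftmost bit):

```python
def int_to_bitstring(value: int, n_qubits: int) -> str:
    # Cirq histogram key -> zero-padded bitstring, first measured qubit leftmost.
    return format(value, f'0{n_qubits}b')

def bitstring_to_int(bits: str) -> int:
    # Inverse: bitstring -> Cirq-style integer key.
    return int(bits, 2)

print(int_to_bitstring(0, 3))   # '000'
print(int_to_bitstring(7, 3))   # '111'
print(bitstring_to_int('101'))  # 5
```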

PennyLane.

import pennylane as qml
import numpy as np

dev = qml.device('default.qubit', wires=3, shots=1000)

@qml.qnode(dev)
def ghz():
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    qml.CNOT(wires=[1, 2])
    return qml.counts(wires=[0, 1, 2])

print(ghz())  # ~{'000': 500, '111': 500}

Why PennyLane uses decorators: a QNode is a quantum function, tagged with @qml.qnode(device). This decorator is what allows PennyLane to differentiate the function later; for a non-parameterised circuit like GHZ, the decorator is pure overhead, but it is the same pattern you will use for VQE where differentiation becomes essential.

CUDA-Q.

import cudaq

@cudaq.kernel
def ghz():
    q = cudaq.qvector(3)
    h(q[0])
    x.ctrl(q[0], q[1])
    x.ctrl(q[1], q[2])
    mz(q)

result = cudaq.sample(ghz, shots_count=1000)
print(result)  # ~{'000': 500, '111': 500}

Why CUDA-Q's syntax is terse: @cudaq.kernel compiles the function to a quantum-program representation. cudaq.qvector(3) allocates three qubits; h, x.ctrl, mz are the native gate primitives. The kernel model is designed to compile cleanly for multiple QPU backends without Python-level overhead.

Result. All four programs produce the same distribution: roughly 500 counts for 000, 500 for 111, near-zero elsewhere. The differences are stylistic — Qiskit is textbook-like, Cirq is qubit-first, PennyLane wraps a decorator for differentiability, CUDA-Q uses a kernel pattern for multi-backend compilation.

Figure: Measured counts for a 1000-shot GHZ state — identical across all four SDKs on a simulator. A histogram over the eight three-bit outcomes: the |000⟩ and |111⟩ bars sit near 500 counts each (502 and 498 in this run); the six intermediate bars are essentially zero. On real hardware the middle six bars rise to roughly 1–5% of the total each — the noise signature.
On a simulator, all four SDKs produce identical histograms — roughly 500 counts on |000⟩, 500 on |111⟩, zero on the six intermediate states. On a real quantum computer the intermediate bars rise to a few percent each, and that rise is the hardware-fidelity measurement. The mathematics of the GHZ state is the same in all four SDKs; the difference is only in how you type it.
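A common quick check on a hardware run is the GHZ population ratio — the fraction of shots landing on the two ideal outcomes. A minimal sketch (the counts dictionaries are illustrative, not from a real device; note this population term is a necessary condition, not the full GHZ fidelity, which also requires a coherence measurement):

```python
def ghz_population(counts: dict) -> float:
    # Fraction of shots on the two ideal GHZ outcomes |000> and |111>.
    shots = sum(counts.values())
    return (counts.get('000', 0) + counts.get('111', 0)) / shots

sim_counts = {'000': 502, '111': 498}          # noiseless simulator run
noisy_counts = {'000': 471, '111': 463,         # hypothetical hardware run
                '001': 14, '010': 11, '011': 9,
                '100': 12, '101': 8, '110': 12}

print(ghz_population(sim_counts))    # 1.0
print(ghz_population(noisy_counts))  # 0.934
```

The drop from 1.0 to ~0.93 is exactly the "rise of the middle six bars" the figure describes, compressed into one number.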

Interpretation. The four SDKs are dialects of the same language. A Bell state, a GHZ state, a Deutsch–Jozsa oracle, a Grover iteration — all appear in roughly the same shape across all four, with minor syntactic differences. Learning your second SDK takes a weekend, not a month, once you know your first well.

Example: VQE for H2 in PennyLane

The second worked example showcases PennyLane's killer feature — automatic differentiation through a quantum circuit — by running a small variational quantum eigensolver (VQE) for the ground-state energy of the hydrogen molecule. Compare with the Qiskit VQE implementation in the IBM Quantum Learning chapter.

Example 2: VQE for H2 ground-state energy in PennyLane

Step 1. The Hamiltonian. PennyLane's built-in chemistry module produces the H2 Hamiltonian directly:

import pennylane as qml
from pennylane import numpy as np

symbols = ["H", "H"]
coordinates = np.array([0.0, 0.0, -0.6614, 0.0, 0.0, 0.6614])  # atomic units (Bohr)
H, qubits = qml.qchem.molecular_hamiltonian(symbols, coordinates)
print(f"Number of qubits: {qubits}")  # 4

Why this one-liner replaces 30 lines of Qiskit boilerplate: PennyLane's chemistry module wraps PySCF and handles the Jordan–Wigner mapping internally. The output Hamiltonian is a qml.Hamiltonian object that PennyLane knows how to measure on any backend.

Step 2. The ansatz. Use a single-parameter ansatz — Hartree-Fock reference plus one double excitation:

dev = qml.device("default.qubit", wires=qubits)

@qml.qnode(dev)
def circuit(theta):
    qml.BasisState(np.array([1, 1, 0, 0]), wires=[0, 1, 2, 3])
    qml.DoubleExcitation(theta, wires=[0, 1, 2, 3])
    return qml.expval(H)

Why this captures H2 correlation: BasisState([1,1,0,0]) prepares the Hartree-Fock reference (the first two spin-orbitals occupied). DoubleExcitation(theta) mixes in the doubly excited configuration with amplitude governed by θ. For H2 in the minimal STO-3G basis, this one parameter captures essentially all the electron correlation.
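Why one angle suffices can be seen in a toy model: restricted to the span of |HF⟩ = |1100⟩ and the double excitation |D⟩ = |0011⟩, the Hamiltonian is a 2×2 matrix, and (up to sign conventions) the ansatz ψ(θ) = cos(θ/2)|HF⟩ + sin(θ/2)|D⟩ sweeps exactly the states needed to reach its lowest eigenvalue. A pure-Python sketch with made-up matrix elements (e_hf, e_d, g are illustrative numbers, not the actual H2 values):

```python
import math

# Illustrative 2x2 Hamiltonian in the {|HF>, |D>} subspace:
# diagonal energies plus an off-diagonal coupling (made-up values).
e_hf, e_d, g = -1.117, -0.471, -0.181

def energy(theta: float) -> float:
    # <psi(theta)| H |psi(theta)> for psi = cos(t/2)|HF> + sin(t/2)|D>
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return c * c * e_hf + s * s * e_d + 2 * c * s * g

# Exact ground energy of the 2x2 matrix, for comparison.
mean, half_gap = (e_hf + e_d) / 2, (e_hf - e_d) / 2
exact = mean - math.sqrt(half_gap ** 2 + g ** 2)

best = min(energy(k * 2 * math.pi / 2000) for k in range(2000))
print(f"min over theta grid: {best:.6f}")
print(f"exact eigenvalue:    {exact:.6f}")  # agrees to grid resolution
```

Minimising over the single angle θ lands on the exact lower eigenvalue — which is why, in the minimal basis, the one-parameter VQE below converges to the FCI energy rather than an approximation of it.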

Step 3. Optimise — with a PennyLane autodiff optimiser.

from pennylane import GradientDescentOptimizer

theta = np.array(0.0, requires_grad=True)
opt = GradientDescentOptimizer(stepsize=0.4)

for i in range(30):
    theta, energy = opt.step_and_cost(circuit, theta)
    if i % 5 == 0:
        print(f"Step {i}: E = {energy:.6f} Ha, θ = {theta:.4f}")

Why this is the payoff of PennyLane: opt.step_and_cost(circuit, theta) automatically computes the gradient of the circuit with respect to θ using the parameter-shift rule, then takes a step downhill. You did not write a single line of gradient code. In Qiskit you would call scipy.optimize.minimize with a finite-difference gradient, which is slower and less accurate; in PennyLane, gradients are built in.
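The parameter-shift rule itself is simple enough to verify by hand. For a gate generated by a Pauli operator, the exact gradient is a difference of two circuit evaluations at parameters shifted by ±π/2 — no finite differences involved. A pure-Python check on the textbook case where ⟨Z⟩ after RY(θ) on |0⟩ is exactly cos θ (function names are ours, for illustration):

```python
import math

def expval(theta: float) -> float:
    # <Z> after RY(theta) applied to |0> is exactly cos(theta).
    return math.cos(theta)

def parameter_shift_grad(f, theta: float) -> float:
    # Exact gradient from just two evaluations at +/- pi/2 shifts.
    return (f(theta + math.pi / 2) - f(theta - math.pi / 2)) / 2

theta = 0.37
print(parameter_shift_grad(expval, theta))  # equals -sin(0.37), exactly
print(-math.sin(theta))
```

Because each evaluation is itself a circuit run, the same two-point recipe works on real hardware, where backpropagation through the device is impossible — that is what PennyLane executes under the hood of step_and_cost.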

Step 4. Result.

Step 0:  E = -1.116735 Ha, θ = 0.1152
Step 5:  E = -1.136176 Ha, θ = 0.2068
Step 10: E = -1.137271 Ha, θ = 0.2277
Step 15: E = -1.137283 Ha, θ = 0.2288
Step 20: E = -1.137283 Ha, θ = 0.2288

The VQE converges in about 15 steps to the exact ground-state energy of H2 (-1.137283 Hartree, matching Full Configuration Interaction to six decimal places).

Step 5. Swap backends. To run the same optimisation on real IBM hardware, swap the device definition — in current pennylane-qiskit this is the qiskit.remote device wrapping an IBM Runtime backend:

from qiskit_ibm_runtime import QiskitRuntimeService

backend = QiskitRuntimeService().backend("ibm_brisbane")
dev = qml.device("qiskit.remote", wires=qubits, backend=backend)

The rest of the code is unchanged. PennyLane's plugin layer handles the hardware submission, and the parameter-shift gradient rule (which PennyLane uses by default on hardware, since backprop is not available there) does the right thing automatically.

Figure: VQE convergence of the H2 ground-state energy over 15 optimisation steps (PennyLane autodiff). The energy starts near -1.117 Ha at step 0 (θ=0, the Hartree-Fock reference), drops sharply over the first five steps, and levels off at the exact FCI value of -1.137283 Ha (dashed reference line) by step 15.
VQE convergence for H2 in PennyLane. The curve shows the expected energy ⟨ψ(θ)|H|ψ(θ)⟩ as a function of optimiser step; it drops from the Hartree-Fock energy (-1.117 Ha, θ=0) and asymptotes to the exact ground-state energy (-1.137283 Ha) after about 15 steps. PennyLane's parameter-shift gradient rule is what makes this convergence straightforward — you did not write any gradient code, and the same code runs unchanged on a real IBM or Xanadu device.

Interpretation. PennyLane's advantage is most visible in this VQE example: the gradient is automatic, the optimiser is built in, and the same code runs on four different hardware backends with only the device definition swapped. For any workflow that combines a parameterised quantum circuit with a classical optimiser — and that is most of NISQ-era quantum computing — PennyLane is the shortest path from idea to running code.

Common confusions about SDKs

Going deeper

You have the four SDKs, one GHZ example, and one VQE walk-through. The going-deeper below assumes you are thinking about which SDK to commit to for a research project and need a more detailed decision tree — including pivot points for when your initial SDK choice turns out to be wrong — plus notes on less-common SDKs, Indian-specific deployment, and the shape of the SDK ecosystem in 2026.

The decision tree, sharpened

When you start a project, walk this tree:

  1. Do you have a specific hardware target? If yes — pick the native SDK. IBM → Qiskit. Google → Cirq. Xanadu photonic → PennyLane (PennyLane is Xanadu's native SDK, not just an ML tool). Quantinuum → Quantinuum's own H-Series SDK (a distant fifth option not covered above, but worth knowing). IonQ → IonQ's SDK or Qiskit via the IonQ provider.

  2. Is your workflow variational? If yes — PennyLane is usually the right choice regardless of hardware, because its autodiff story cuts through all hardware-specific boilerplate.

  3. Are you simulating large circuits (30+ qubits)? If yes and you have NVIDIA GPUs available — CUDA-Q. If yes and you only have CPUs — Qiskit's AerSimulator, or PennyLane's lightning.qubit, both of which push to ~28 qubits comfortably on a 32GB laptop.

  4. Are you doing theoretical research, not running on hardware? For complexity theory, for abstract algorithm design, Qiskit is fine because it is the lingua franca. Use whatever your lab uses.

  5. Are you reproducing results from a paper? Use the SDK the paper used. Papers typically publish Qiskit or Cirq code; some recent papers publish CUDA-Q.

The less-common SDKs

Beyond the big four, several smaller SDKs are worth knowing about.

Running from India — deployment notes

A few practical notes for running SDKs from India:

  1. PyPI mirrors. Indian academic networks sometimes block or rate-limit pypi.org. The NIC mirror (pypi.nic.in at some institutions) and Sonatype mirrors work. For pip install failures, try adding -i https://pypi.org/simple/ explicitly, or use conda for the core packages.

  2. GPU availability. CUDA-Q becomes useful when you have NVIDIA GPUs. Major Indian HPC resources — PARAM Siddhi-AI, the National Supercomputing Mission clusters, cloud credits from AWS/Azure/Google educational programs — all support CUDA-Q workloads.

  3. Latency to IBM cloud. India to IBM's US-based QPU cloud is ~250 ms round-trip. Irrelevant for job submissions (they queue for minutes anyway) but can make interactive debugging feel slow. Run simulators locally; submit jobs in batches to real hardware.

  4. Access policies. IBM's free tier is open to anyone with an email. Google's hardware is invitation-based and most Indian researchers access it only through collaborations. Xanadu's photonic hardware is accessible with a Xanadu cloud account (free tier exists). CUDA-Q's GPU simulators run locally — no external access needed.

The 2026 ecosystem shape

As of early 2026, the SDK ecosystem has stabilised from the 2021–2023 churn. A rough picture:

The expected consolidation has not happened. Instead, the SDKs have settled into complementary niches, and the most productive researchers use several in parallel. Classical scientific computing went the same way (NumPy + SciPy + PyTorch + JAX + CuPy, different tools for different purposes), and quantum computing is following that path.

When your initial SDK choice was wrong

Signs you picked the wrong SDK:

Switching is cheap. Porting a typical research project's code takes an afternoon. Do not stay trapped in the wrong SDK out of sunk cost.

The Indian SDK future

India has no dominant SDK of its own yet, and probably will not produce one — the network effects around the existing four are too strong. But Indian contributions to the ecosystem are real and growing:

The short answer is: you can contribute meaningfully to any of the four SDKs from any Indian institution today. The long answer is that by 2030 there may also be a distinct Indian stack alongside them, built on Qiskit or CUDA-Q foundations but adapted for domestic hardware.

Where this leads next

References

  1. IBM, Qiskit documentation — the canonical reference for Qiskit 2.0, primitives, and the transpiler.
  2. Google Quantum AI, Cirq documentation — the Cirq user guide and Sycamore integration notes.
  3. Xanadu, PennyLane documentation and QML tutorials — the PennyLane reference plus a curated library of variational-workflow notebooks.
  4. NVIDIA, CUDA-Q documentation — the CUDA-Q user guide, kernel model, and simulator benchmarks.
  5. Bergholm et al., PennyLane: Automatic differentiation of hybrid quantum-classical computations (2018) — arXiv:1811.04968. The foundational PennyLane paper.
  6. Wikipedia, OpenQASM — the open circuit description format shared between all four SDKs.