Raspberry Pi programmable IO pitfalls illustrated with a musical example
Also available: A MicroPython version of this article
In JavaScript and other languages, we call a surprising or inconsistent behavior a “Wat!” [that is, a “What!?”]. For example, in JavaScript, an empty array plus an empty array produces an empty string, [] + [] === “”. Wat!
Rust, by comparison, is consistent and predictable. However, one corner of Rust on the Raspberry Pi Pico microcontroller offers similar surprises. Specifically, the Pico’s Programmable Input/Output (PIO) subsystem, while incredibly powerful and versatile, comes with peculiarities.
PIO programming matters because it provides an ingenious solution to the challenge of precise, low-level hardware control. It is incredibly fast and flexible: rather than relying on special-purpose hardware for the countless peripherals you might want to control, PIO allows you to define custom behaviors in software, seamlessly adapting to your needs without adding hardware complexity.
Consider this simple example: a $15 theremin-like musical instrument. By waving their hand in the air, the musician changes the pitch of (admittedly annoying) tones. Using PIO provides a simple way to program this device that ensures it reacts instantly to movement.
https://medium.com/media/26c2c5e54d35aff78d866c70bdc657ad/href
So, all is wonderful, except — to paraphrase Spider-Man:
With great power comes… nine Wats!?
We’ll explore and illustrate those nine PIO Wats through the creation of this theremin.
Who Is This Article For?
- All Programmers: Microcontrollers like the Pico cost under $7 and support high-level languages like Python, Rust, and C/C++. This article will show how microcontrollers let your programs interact with the physical world and introduce you to programming the Pico’s low-level, high-performance PIO hardware.
- Rust Pico Programmers: Curious about the Pico’s hidden potential? Beyond its two main cores, it has eight tiny “state machines” dedicated to PIO programming. These state machines take over time-critical tasks, freeing up the main processors for other work and enabling surprising parallelism.
- C/C++ Pico Programmers: While this article uses Rust, PIO programming is — for good and bad — nearly identical across all languages. If you understand it here, you’ll be well-equipped to apply it in C/C++.
- MicroPython Pico Programmers: You may wish to read the MicroPython version of this article.
- PIO Programmers: The journey through nine Wats may not be as entertaining as JavaScript’s quirks (thankfully), but it will shed light on the peculiarities of PIO programming. If you’ve ever found PIO programming confusing, this article should reassure you that the problem isn’t (necessarily) you — it’s partly PIO itself. Most importantly, understanding these Wats will make writing PIO code simpler and more effective.
Finally, this article isn’t about “fixing” PIO programming. PIO excels at its primary purpose: efficiently and flexibly handling custom peripheral interfaces. Its design is purposeful and well-suited to its goals. Instead, this article focuses on understanding PIO programming and its quirks — starting with a bonus Wat.
Bonus Wat 0: “State Machines” Are Not State Machines
Despite their name, the eight “PIO state machines” in the Raspberry Pi Pico are not state machines in the formal computer science sense. Instead, they are tiny programmable processors with their own assembly-like instruction set, capable of looping, branching, and conditional operations. In reality, they are closer to Harvard architecture machines or, like most practical computers, von Neumann machines.
Each state machine processes one instruction per clock cycle. The $4 Pico 1 runs at 125 million cycles per second, while the $5 Pico 2 offers a faster 150 million cycles per second. Each instruction performs a simple operation, such as “move a value” or “jump to a label”.
With that bonus Wat out of the way, let’s move to our first main Wat.
Wat 1: The Register Hunger Games
In PIO programming, a register is a small, fast storage location that acts like a variable for the state machine. You might dream of an abundance of variables to hold your counters, delays, and temporary values, but the reality is brutal: you only get two general-purpose registers, x and y. It’s like The Hunger Games, where no matter how many tributes enter the arena, only Katniss and Peeta emerge as victors. You’re forced to winnow down your needs to fit within these two registers, ruthlessly deciding what to prioritize and what to sacrifice. Also, like the Hunger Games, we can sometimes bend the rules.
Let’s start with a challenge: create a backup beeper — 1000 Hz for ½ second, silence for ½ second, repeat. The result? “Beep Beep Beep…”
We would like five variables:
- half_period: The number of clock cycles to hold the voltage high and then low to create a 1000 Hz tone. This is 125,000,000 / 1000 / 2 = 62,500 cycles high and 62,500 cycles low.
- y: Loop counter from 0 to half_period to create a delay.
- period_count: The number of repeated periods needed to fill ½ second of time. 125,000,000 × 0.5 / (62,500 × 2) = 500.
- x: Loop counter from 0 to period_count to fill ½ second of time.
- silence_cycles: The number of clock cycles for ½ second of silence. 125,000,000 × 0.5 = 62,500,000.
We want five registers but can only have two, so let the games begin! May the odds be ever in your favor.
First, we can eliminate silence_cycles because it can be derived as half_period × period_count × 2. While PIO doesn’t support multiplication, it does support loops. By nesting two loops—where the inner loop delays for 2 clock cycles—we can create a delay of 62,500,000 clock cycles.
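The arithmetic behind these variables can be checked on the desktop before touching the Pico. The following is a minimal sketch, assuming the Pico 1's 125 MHz clock; the function names are hypothetical and are not part of the project's code.

```rust
/// Clock cycles to hold the pin high (and then low) for one tone period.
fn half_period(clock_hz: u32, tone_hz: u32) -> u32 {
    clock_hz / tone_hz / 2
}

/// Number of full tone periods that fit in half a second.
fn period_count(clock_hz: u32, half_period: u32) -> u32 {
    clock_hz / (half_period * 2) / 2
}

/// Cycles of silence, derived from the other two values,
/// so it does not need a register of its own.
fn silence_cycles(half_period: u32, period_count: u32) -> u32 {
    half_period * period_count * 2
}

fn main() {
    let clock_hz = 125_000_000; // Pico 1 system clock (assumption for this sketch)
    let hp = half_period(clock_hz, 1000);
    let pc = period_count(clock_hz, hp);
    println!("half_period = {hp}, period_count = {pc}");
    println!("silence_cycles = {}", silence_cycles(hp, pc));
}
```

The last function is the reason silence_cycles can be dropped: two nested loops over the values we already have reproduce it.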
One variable down, but how can we eliminate two more? Fortunately, we don’t have to. While PIO only provides two general-purpose registers, x and y, it also includes two special-purpose registers: osr (output shift register) and isr (input shift register).
The PIO code that we’ll see in a moment implements the backup beeper. Here’s how it works:
Initialization:
- The pull block instruction reads the half period of the tone (62,500 clock cycles) from a buffer and places the value into osr.
- The value is then copied to isr for later use.
- The second pull block reads the period count (500 repeats) from the buffer and places the value in osr, where we leave it.
Beep Loops:
- The mov x, osr instruction copies the period count into the x register, which serves as the outer loop counter.
- For the inner loops, mov y, isr repeatedly copies the half period into y to create delays for the high and low states of the tone.
Silence Loops:
- The silence loops mirror the structure of the beep loops but don’t set any pins, so they act solely as a delay.
Wrap and Continuous Execution:
- The .wrap_target and .wrap directives define the main loop of the state machine.
- After finishing both the beep and silence loops, the state machine jumps back near the start of the program, repeating the sequence indefinitely.
With this outline in mind, here’s the PIO assembly code for generating the backup beeper signal.
.program backup

; Read initial configuration
    pull block                      ; Read the half period of the beep sound
    mov isr, osr                    ; Store the half period in ISR
    pull block                      ; Read the period_count

.wrap_target                        ; Start of the main loop

; Generate the beep sound
    mov x, osr                      ; Load period_count into X
beep_loop:
    set pins, 1                     ; Set the buzzer to high voltage (start the tone)
    mov y, isr                      ; Load the half period into Y
beep_high_delay:
    jmp y--, beep_high_delay        ; Delay for the half period
    set pins, 0                     ; Set the buzzer to low voltage (end the tone)
    mov y, isr                      ; Load the half period into Y
beep_low_delay:
    jmp y--, beep_low_delay         ; Delay for the low duration
    jmp x--, beep_loop              ; Repeat the beep loop

; Silence between beeps
    mov x, osr                      ; Load the period count into X for outer loop
silence_loop:
    mov y, isr                      ; Load the half period into Y for inner loop
silence_delay:
    jmp y--, silence_delay [1]      ; Delay for two clock cycles (jmp + 1 extra)
    jmp x--, silence_loop           ; Repeat the silence loop

.wrap                               ; End of the main loop, jumps back to wrap_target
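To see how closely this program hits its half-second targets, the listing can be modeled on the desktop by counting one clock cycle per instruction. The sketch below is my own cycle count, not something from the project's code, and its function names are hypothetical. It relies on one PIO detail: jmp y-- tests the register before decrementing it, so a loop that starts at n executes n + 1 times.

```rust
/// Executions of a `jmp y--, label` delay loop that starts with the counter at `n`.
/// The jump is taken while the counter is non-zero (tested before the decrement),
/// so the instruction runs n + 1 times.
fn delay_loop_runs(n: u64) -> u64 {
    n + 1
}

/// Cycles for one full tone period inside `beep_loop`:
/// (set + mov + delay) for high, (set + mov + delay) for low, then `jmp x--`.
fn beep_period_cycles(half_period: u64) -> u64 {
    2 * (2 + delay_loop_runs(half_period)) + 1
}

/// Cycles for the whole beep phase: one `mov x, osr`, then period_count + 1 passes.
fn beep_phase_cycles(half_period: u64, period_count: u64) -> u64 {
    1 + (period_count + 1) * beep_period_cycles(half_period)
}

/// Cycles for the silence phase. Each inner `jmp y--` carries a `[1]` delay,
/// so every execution costs two cycles.
fn silence_phase_cycles(half_period: u64, period_count: u64) -> u64 {
    1 + (period_count + 1) * (1 + 2 * delay_loop_runs(half_period) + 1)
}

fn main() {
    let (half_period, period_count) = (62_500, 500);
    println!("beep:    {} cycles", beep_phase_cycles(half_period, period_count));
    println!("silence: {} cycles", silence_phase_cycles(half_period, period_count));
    println!("target:  {} cycles", 62_500_000u64);
}
```

Under this model both phases run roughly 0.2% longer than the 62,500,000-cycle target, which is inaudible for a backup beeper.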
Here’s the core Rust code to configure and run the PIO program for the backup beeper. It uses the Embassy framework for embedded applications. The function initializes the state machine, calculates the timing values (half_period and period_count), and sends them to the PIO. It then plays the beeping sequence for 5 seconds before entering an endless loop. The full source file and project are available on GitHub.
async fn inner_main(_spawner: Spawner) -> Result<Never> {
    info!("Hello, back_up!");
    let hardware: Hardware<'_> = Hardware::default();
    let mut pio0 = hardware.pio0;
    let state_machine_frequency = embassy_rp::clocks::clk_sys_freq();
    let mut back_up_state_machine = pio0.sm0;
    let buzzer_pio = pio0.common.make_pio_pin(hardware.buzzer);
    back_up_state_machine.set_pin_dirs(Direction::Out, &[&buzzer_pio]);
    back_up_state_machine.set_config(&{
        let mut config = Config::default();
        config.set_set_pins(&[&buzzer_pio]); // For set instruction
        let program_with_defines = pio_file!("examples/backup.pio");
        let program = pio0.common.load_program(&program_with_defines.program);
        config.use_program(&program, &[]);
        config
    });
    back_up_state_machine.set_enable(true);
    let half_period = state_machine_frequency / 1000 / 2;
    let period_count = state_machine_frequency / (half_period * 2) / 2;
    info!(
        "Half period: {}, Period count: {}",
        half_period, period_count
    );
    back_up_state_machine.tx().wait_push(half_period).await;
    back_up_state_machine.tx().wait_push(period_count).await;
    Timer::after(Duration::from_millis(5000)).await;
    info!("Disabling back_up_state_machine");
    back_up_state_machine.set_enable(false);
    // run forever
    loop {
        Timer::after(Duration::from_secs(3_153_600_000)).await; // 100 years
    }
}
Here’s what happens when you run the program:
https://medium.com/media/1bea567079143d6da85e57edc2eec35b/href
Aside 1: Running this yourself
The simplest — but often frustrating — way to run Rust code on the Pico is to cross-compile it on your desktop and manually copy over the resulting files. A much better approach is to invest in a $12 Raspberry Pi Debug Probe and set up probe-rs on your desktop. With this setup, you can use cargo run to automatically compile on your desktop, copy to your Pico, and then start your code running. Even better, your Pico code can use info! statements to send messages back to your desktop, and you can perform interactive breakpoint debugging. For setup instructions, visit the probe-rs website.
To hear sound, I connected a passive buzzer, a resistor, and a transistor to the Pico. For detailed wiring diagrams and a parts list, check out the passive buzzer instructions in the SunFounder’s Kepler Kit.
Aside 2: If your only goal is to generate tones with the Pico, PIO isn’t necessary. MicroPython is fast enough to toggle pins directly, or you can use the Pico’s built-in pulse width modulation (PWM) feature.
Alternative Endings to the Register Hunger Games
We used four registers — two general and two special — to resolve the challenge. If this solution feels less than satisfying, here are alternative approaches to consider:
Use Constants: Why make half_period, period_count, and silence_cycles variables at all? Hardcoding the constants “62,500,” “500,” and “62,500,000” could simplify the design. However, PIO constants have limitations, which we’ll explore in Wat 5.
Pack Bits: Registers hold 32 bits. Do we really need two registers (2×32=64 bits) to store half_period and period_count? No. Storing 62,500 only requires 16 bits, and 500 requires 9 bits. We could pack these into a single register and use the out instruction to shift values into x and y. This approach would free up either osr or isr for other tasks, but only one at a time—the other register must hold the packed value.
Slow Motion: In Rust with the Embassy framework, you can configure a PIO state machine to run at a slower frequency by setting its clock_divider. This allows the state machine to run as slow as ~1907 Hz. Running the state machine at a slower speed means that values like half_period can be smaller, potentially as small as 2. Small values are easier to hardcode as constants and more compactly bit-packed into registers.
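The last two alternatives come down to a little arithmetic that can be tried on the desktop. The sketch below is only an illustration with hypothetical names. It assumes half_period sits in the low 16 bits of the packed word and period_count in the next 9 bits, that the state machine shifts low bits out first, and that the largest clock divider is 65,536 (which is where the ~1907 Hz figure comes from at 125 MHz).

```rust
const HALF_PERIOD_BITS: u32 = 16;
const PERIOD_COUNT_BITS: u32 = 9;

/// Pack both values into one 32-bit word: half_period in the low 16 bits,
/// period_count in the 9 bits above it. Returns None if a value does not fit.
fn pack(half_period: u32, period_count: u32) -> Option<u32> {
    if half_period >= (1 << HALF_PERIOD_BITS) || period_count >= (1 << PERIOD_COUNT_BITS) {
        return None;
    }
    Some(half_period | (period_count << HALF_PERIOD_BITS))
}

/// Host-side mirror of shifting the word out low bits first:
/// take 16 bits for half_period, then 9 bits for period_count.
fn unpack(word: u32) -> (u32, u32) {
    let half_period = word & ((1 << HALF_PERIOD_BITS) - 1);
    let period_count = (word >> HALF_PERIOD_BITS) & ((1 << PERIOD_COUNT_BITS) - 1);
    (half_period, period_count)
}

/// Slowest state machine frequency, assuming a maximum clock divider of 65,536.
fn slowest_state_machine_hz(clock_hz: u32) -> u32 {
    clock_hz / 65_536
}

/// Half period in state machine cycles when the state machine runs at a reduced frequency.
fn half_period_at(state_machine_hz: u32, tone_hz: u32) -> u32 {
    state_machine_hz / tone_hz / 2
}

fn main() {
    let word = pack(62_500, 500).expect("both values fit");
    println!("packed = {word:#010x}, unpacked = {:?}", unpack(word));
    println!("slowest = {} Hz", slowest_state_machine_hz(125_000_000));
    println!("half period at 4 kHz = {}", half_period_at(4_000, 1_000));
}
```

With the state machine slowed to 4,000 Hz, a 1000 Hz tone needs a half period of only 2, which is small enough to hardcode.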
A Happy Ending to the Register Hunger Games
The Register Hunger Games demanded strategic sacrifices and creative workarounds, but we emerged victorious by leveraging PIO’s special registers and clever looping structures. If the stakes had been higher, alternative techniques could have helped us adapt and survive.
But victory in one arena doesn’t mean the challenges are over. In the next Wat, we face a new trial: PIO’s strict 32-instruction limit.
Wat 2: The 32-Instruction Carry-On Suitcase
Congratulations! You’ve purchased a trip around the world for just $4. The catch? All your belongings must fit into a tiny carry-on suitcase. Likewise, PIO programs allow you to create incredible functionality, but every PIO program is limited to just 32 instructions.
Wat! Only 32 instructions? That’s not much space to pack everything you need! But with clever planning, you can usually make it work.
The Rules
- No PIO program can be longer than 32 instructions.
- The wrap_target and wrap directives do not count.
- Labels do not count.
- A Pico 1 includes eight state machines, organized into two blocks of four. A Pico 2 includes twelve state machines, organized into three blocks of four. Each block shares 32 instruction slots. So, because all four state machines in a block draw from the same 32-instruction pool, if one machine’s program uses all 32 slots, there’s no space left for the other three.
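These rules can be turned into a rough instruction counter that runs on the desktop. This is a minimal sketch, not a replacement for the real assembler: it assumes one statement per line, that a semicolon starts a comment, that directives start with a period, and that each label sits on its own line ending in a colon. The function name is hypothetical.

```rust
/// Instruction slots shared by the four state machines in one PIO block.
const SLOTS_PER_BLOCK: usize = 32;

/// Count the instruction slots a PIO listing would occupy, under the
/// simplifying assumptions described above.
fn count_instructions(source: &str) -> usize {
    source
        .lines()
        .map(|line| line.split(';').next().unwrap_or("").trim()) // drop comments
        .filter(|code| !code.is_empty()) // blank or comment-only lines
        .filter(|code| !code.starts_with('.')) // .program, .wrap_target, .wrap
        .filter(|code| !code.ends_with(':')) // labels
        .count()
}

fn main() {
    let source = "\
.program demo
; Toggle a pin forever
.wrap_target
again:
    set pins, 1 ; high
    set pins, 0 ; low
    jmp again
.wrap
";
    let used = count_instructions(source);
    println!("{used} of {SLOTS_PER_BLOCK} slots used");
}
```

Applied to the backup beeper listing from Wat 1, this count comes to 15 by my tally, so just under half of one block's suitcase.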
When Your Suitcase Won’t Close
If your idea doesn’t fit in the PIO instruction slots, these packing tricks may help. (Disclaimer: I haven’t tried all of these myself.)
- Swap PIO Programs on the Fly:
Instead of trying to cram everything into one program, consider swapping out programs mid-flight. Load only what you need, when you need it.
- Share Programs Across State Machines:
Multiple state machines can run the same program at the same time. Each state machine can make the shared program behave differently based on an input value.
- Use Rust/Embassy’s exec_instr Command:
Save space by offloading instructions to Rust. For example, you can execute initialization steps before enabling the state machine:
let half_period = state_machine_frequency / 1000 / 2;
back_up_state_machine.tx().push(half_period); // Using non-blocking push since FIFO is empty
let pull_block = pio_asm!("pull block").program.code[0];
unsafe {
    back_up_state_machine.exec_instr(pull_block);
}
- Use PIO’s exec commands:
Within your state machine, you can dynamically execute instructions using PIO’s exec mechanism. For example, you can execute an instruction value stored in osr with out exec. Alternatively, you can use mov exec, x or mov exec, y to execute instructions directly from those registers.
- Offload to the Main Processors:
If all else fails, move more of your program to the Pico’s larger dual processors. Think of this as shipping your extra baggage to your destination separately. The Pico SDK (section 3.1.4) calls this “bit banging”.
With your bags now packed, let’s join Dr. Dolittle’s search for a fabled creature.
Bonus Wat 2.5: Dr. Dolittle’s PIO Pushmi-Pullyu
Two readers pointed out an important PIO Wat that I missed — so here’s a bonus! When programming PIO, you’ll notice something peculiar:
- The PIO pull instruction receives values from the TX FIFO (transmit buffer) and inputs them into the output shift register (osr). So, it inputs into output and receives from transmit.
- Likewise, the PIO push instruction outputs values from the input shift register (isr) and transmits them to the RX FIFO (receive buffer). So, it outputs from input and transmits to receive.
Wat!? Like the two-headed Pushmi-Pullyu from the Dr. Dolittle stories, something seems backwards. But it starts to make sense when you realize PIO names most things from the host’s perspective (MicroPython, Rust, C/C++), not the point of view of the PIO program.
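This table summarizes the instructions, registers, and buffer names. (“FIFO” stands for first-in-first-out.)
Instruction | Reads from | Writes to
pull | TX FIFO (transmit buffer) | osr (output shift register)
push | isr (input shift register) | RX FIFO (receive buffer)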
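A toy model can make the naming easier to keep straight. The sketch below is not real hardware access and ignores FIFO capacity; the struct and its method names are hypothetical. It only shows which direction each buffer carries data, with the host-side operations named from the host's point of view.

```rust
use std::collections::VecDeque;

/// Toy model of one state machine's FIFOs and shift registers.
struct ToyStateMachine {
    tx_fifo: VecDeque<u32>, // host -> PIO ("transmit" as the host sees it)
    rx_fifo: VecDeque<u32>, // PIO -> host ("receive" as the host sees it)
    osr: u32,               // output shift register, filled by `pull`
    isr: u32,               // input shift register, emptied by `push`
}

impl ToyStateMachine {
    fn new() -> Self {
        Self { tx_fifo: VecDeque::new(), rx_fifo: VecDeque::new(), osr: 0, isr: 0 }
    }

    /// Host side: queue a value for the PIO program.
    fn host_write(&mut self, value: u32) {
        self.tx_fifo.push_back(value);
    }

    /// PIO side: `pull` moves the oldest TX FIFO entry into osr.
    /// Returns false when the FIFO is empty (where `pull block` would stall).
    fn pio_pull(&mut self) -> bool {
        match self.tx_fifo.pop_front() {
            Some(value) => {
                self.osr = value;
                true
            }
            None => false,
        }
    }

    /// PIO side: `push` copies isr into the RX FIFO.
    fn pio_push(&mut self) {
        self.rx_fifo.push_back(self.isr);
    }

    /// Host side: read the oldest value the PIO program pushed, if any.
    fn host_read(&mut self) -> Option<u32> {
        self.rx_fifo.pop_front()
    }
}

fn main() {
    let mut sm = ToyStateMachine::new();
    sm.host_write(62_500);
    let got_value = sm.pio_pull();
    sm.isr = sm.osr; // stand-in for `mov isr, osr`
    sm.pio_push();
    println!("pulled: {got_value}, host read back: {:?}", sm.host_read());
}
```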
With the Pushmi-Pullyu in hand, we next move to the scene of a mystery.
Wat 3: The pull noblock Mystery
In Wat 1, we programmed our audio hardware as a backup beeper. But that’s not what we need for our musical instrument. Instead, we want a PIO program that plays a given tone indefinitely — until it’s told to play a new one. The program should also wait silently when given a special “rest” tone.
Resting until a new tone is provided is easy to program with pull block—we’ll explore the details below. Playing a tone at a specific frequency is also straightforward, building on the work we did in Wat 1.
But how can we check for a new tone while continuing to play the current one? The answer lies in using “noblock” instead of “block” in pull noblock. Now, if there’s a new value, it will be loaded into osr, allowing the program to update seamlessly.
Here’s where the mystery begins: what happens to osr if pull noblock is called and there’s no new value?
I assumed it would keep its previous value. Wrong! Maybe it gets reset to 0? Wrong again! The surprising truth: it gets the value of x. Why? (No, not y — x.) Because the Pico SDK says so. Specifically, section 3.4.9.2 explains:
A nonblocking PULL on an empty FIFO has the same effect as MOV OSR, X.
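That one sentence is easy to model on the desktop. The sketch below is a toy illustration of the documented behavior, with a hypothetical function name and a VecDeque standing in for the TX FIFO.

```rust
use std::collections::VecDeque;

/// Model of `pull noblock`: returns the value osr holds afterwards.
/// With data waiting, osr takes the oldest TX FIFO entry.
/// With an empty FIFO, the effect matches `mov osr, x`.
fn pull_noblock(tx_fifo: &mut VecDeque<u32>, x: u32) -> u32 {
    tx_fifo.pop_front().unwrap_or(x)
}

fn main() {
    let mut tx_fifo = VecDeque::from([142_045u32]); // one new half period waiting
    let x = 238_549; // the half period currently playing, copied into x beforehand

    let first = pull_noblock(&mut tx_fifo, x); // FIFO has data: take the new value
    let second = pull_noblock(&mut tx_fifo, x); // FIFO empty: fall back to x
    println!("first = {first}, second = {second}");
}
```

This is why a program that wants to keep its current value must copy that value into x just before calling pull noblock.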
Knowing how pull noblock works is important, but there’s a bigger lesson here. Treat the Pico SDK documentation like the back of a mystery novel. Don’t try to solve everything on your own: cheat! Skip to the “whodunit” section, and in section 3.4, read the fine details for each command you use. Reading just a few paragraphs can save you hours of confusion.
Aside: When even the SDK documentation feels unclear, turn to the RP2040 (Pico 1) and RP2350 (Pico 2) datasheets. These encyclopedias (600 and 1,300 pages, respectively) are like omniscient narrators: they provide the ground truth.
With this in mind, let’s look at a practical example. Below is the PIO program for playing tones and rests continuously. It uses pull block to wait for input during a rest and pull noblock to check for updates while playing a tone.
.program sound

; Rest until a new tone is received.
resting:
    pull block                  ; Wait for a new delay value
    mov x, osr                  ; Copy delay into X
    jmp !x resting              ; If delay is zero, keep resting

; Play the tone until a new delay is received.
.wrap_target                    ; Start of the main loop
    set pins, 1                 ; Set the buzzer high voltage.
high_voltage_loop:
    jmp x-- high_voltage_loop   ; Delay
    set pins, 0                 ; Set the buzzer low voltage.
    mov x, osr                  ; Load the half period into X.
low_voltage_loop:
    jmp x-- low_voltage_loop    ; Delay

; Read any new delay value. If none, keep the current delay.
    mov x, osr                  ; Set X, the default value for "pull noblock"
    pull noblock                ; Read a new delay value or use the default.

; If the new delay is zero, rest. Otherwise, continue playing the tone.
    mov x, osr                  ; Copy the delay into X.
    jmp !x resting              ; If X is zero, rest.
.wrap                           ; Continue playing the sound.
We’ll eventually use this PIO program in our theremin-like musical instrument. For now, let’s see the PIO program in action by playing a familiar melody. This demo uses “Twinkle, Twinkle, Little Star” to show how you can control a melody by feeding frequencies and durations to the state machine. With just this code (full file and project), you can make the Pico sing!
const TWINKLE_TWINKLE: [(u32, u64, &str); 16] = [
    // Bar 1
    (262, 400, "Twin-"), // C
    (262, 400, "-kle"),  // C
    (392, 400, "twin-"), // G
    (392, 400, "-kle"),  // G
    (440, 400, "lit-"),  // A
    (440, 400, "-tle"),  // A
    (392, 800, "star"),  // G
    (0, 400, ""),        // rest
    // Bar 2
    (349, 400, "How"),   // F
    (349, 400, "I"),     // F
    (330, 400, "won-"),  // E
    (330, 400, "-der"),  // E
    (294, 400, "what"),  // D
    (294, 400, "you"),   // D
    (262, 800, "are"),   // C
    (0, 400, ""),        // rest
];
async fn inner_main(_spawner: Spawner) -> Result<Never> {
    info!("Hello, sound!");
    let hardware: Hardware<'_> = Hardware::default();
    let mut pio0 = hardware.pio0;
    let state_machine_frequency = embassy_rp::clocks::clk_sys_freq();
    let mut sound_state_machine = pio0.sm0;
    let buzzer_pio = pio0.common.make_pio_pin(hardware.buzzer);
    sound_state_machine.set_pin_dirs(Direction::Out, &[&buzzer_pio]);
    sound_state_machine.set_config(&{
        let mut config = Config::default();
        config.set_set_pins(&[&buzzer_pio]); // For set instruction
        let program_with_defines = pio_file!("examples/sound.pio");
        let program = pio0.common.load_program(&program_with_defines.program);
        config.use_program(&program, &[]);
        config
    });
    sound_state_machine.set_enable(true);
    for (frequency, ms, lyrics) in TWINKLE_TWINKLE.iter() {
        if *frequency > 0 {
            let half_period = state_machine_frequency / frequency / 2;
            info!("{} -- Frequency: {}", lyrics, frequency);
            // Send the half period to the PIO state machine
            sound_state_machine.tx().wait_push(half_period).await;
            Timer::after(Duration::from_millis(*ms)).await; // Wait as the tone plays
            sound_state_machine.tx().wait_push(0).await; // Stop the tone
            Timer::after(Duration::from_millis(50)).await; // Give a short pause between notes
        } else {
            sound_state_machine.tx().wait_push(0).await; // Play a silent rest
            Timer::after(Duration::from_millis(*ms + 50)).await; // Wait for the rest duration + a short pause
        }
    }
    info!("Disabling sound_state_machine");
    sound_state_machine.set_enable(false);
    // run forever
    loop {
        Timer::after(Duration::from_secs(3_153_600_000)).await; // 100 years
    }
}
Here’s what happens when you run the program:
https://medium.com/media/b76f2dc0c295f64853de7608d763c3fb/href
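The frequency-to-delay conversion inside the melody loop can also be tried on the desktop. The sketch below is a simplified illustration, assuming a 125 MHz clock and a shortened four-entry note list; the names are hypothetical and the lyrics column is dropped.

```rust
/// (frequency in Hz, duration in ms); a frequency of 0 means a rest.
const NOTES: [(u32, u64); 4] = [(262, 400), (392, 400), (440, 800), (0, 400)];

/// Half period in clock cycles, or 0 for a rest
/// (0 is the value that tells the PIO program to stay silent).
fn half_period(clock_hz: u32, frequency_hz: u32) -> u32 {
    if frequency_hz == 0 {
        0
    } else {
        clock_hz / frequency_hz / 2
    }
}

/// Total playing time: each entry lasts its duration plus the 50 ms pause.
fn total_ms(notes: &[(u32, u64)]) -> u64 {
    notes.iter().map(|(_, ms)| ms + 50).sum()
}

fn main() {
    let clock_hz = 125_000_000; // Pico 1 system clock (assumption for this sketch)
    for (frequency, ms) in NOTES.iter() {
        let value = half_period(clock_hz, *frequency);
        println!("{frequency} Hz for {ms} ms -> push {value}");
    }
    println!("total: {} ms", total_ms(&NOTES));
}
```

By the same sum, the 16-entry melody above should take about 8 seconds: 7,200 ms of notes and rests plus sixteen 50 ms pauses.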
We’ve solved one mystery, but there’s always another challenge lurking around the corner. In Wat 4, we’ll explore what happens when your smart hardware comes with a catch — it’s also very cheap.
Wat 4: Smart, Cheap Hardware: An Emotional Roller Coaster
With sound working, we turn next to measuring the distance to the musician’s hand using the HC-SR04+ ultrasonic range finder. This small but powerful device is available for less than two dollars.
HC-SR04+ Range Finder (Pen added for scale.)
This little peripheral took me on an emotional roller coaster of “Wats!?”:
- Up: Amazingly, this $2 range finder includes its own microcontroller, making it smarter and easier to use.
- Down: Frustratingly, that same “smart” behavior is unintuitive.
- Up: Conveniently, the Pico can supply peripherals with either 3.3V or 5V power.
- Down: Unpredictably, many range finders are unreliable — or fail outright — at 3.3V, and they can damage your Pico at 5V.
- Up: Thankfully, both damaged range finders and Picos are inexpensive to replace, and a dual-voltage version of the range finder solved my problems.
Details
I initially assumed the range finder would set the Echo pin high when the echo returned. I was wrong.
Instead, the range finder emits a pattern of 8 ultrasonic pulses at 40 kHz (think of it as a backup beeper for dogs). Immediately after, it sets Echo high. The Pico should then start measuring the time until Echo goes low, which signals that the sensor detected the pattern — or that it timed out.
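Once the time that Echo stayed high is known, turning it into a distance is simple arithmetic. The sketch below is not taken from the project's code. It assumes a speed of sound of roughly 343 meters per second (air at room temperature) and uses hypothetical function names.

```rust
/// Approximate speed of sound in air at room temperature, in meters per second (assumption).
const SPEED_OF_SOUND_M_PER_S: u64 = 343;

/// Convert the time Echo stayed high (in microseconds) to a one-way distance in millimeters.
/// The sound travels to the hand and back, so the round trip is halved.
fn echo_us_to_mm(echo_us: u64) -> u64 {
    // 343 m/s is 0.343 mm per microsecond; halving gives echo_us * 343 / 2000.
    echo_us * SPEED_OF_SOUND_M_PER_S / 2_000
}

/// Same conversion, starting from a count of clock cycles.
fn echo_cycles_to_mm(cycles: u64, clock_hz: u64) -> u64 {
    let echo_us = cycles * 1_000_000 / clock_hz;
    echo_us_to_mm(echo_us)
}

fn main() {
    // A hand about 17 cm away keeps Echo high for roughly one millisecond.
    println!("{} mm", echo_us_to_mm(1_000));
    println!("{} mm", echo_cycles_to_mm(125_000, 125_000_000));
}
```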
As for voltage, the documentation specifies the range finder operates at 5V. It seemed to work at 3.3V — until it didn’t. Around the same time, while my Pico kept working with Rust (via the Debug Probe and probe-rs), it stopped working with any of the MicroPython IDEs, which rely on a special USB protocol.
So, at this point both the Pico and the range finder were damaged.
After experimenting with various cables, USB drivers, programming languages, and even an older 5V-only range finder, I finally resolved the issue by:
- Continuing to use this Pico with Rust but switching to another Pico for MicroPython.
- Buying a new dual-voltage 3.3/5V range finder, still just $2 per piece.
Wat 4: Lessons Learned
As the roller coaster returns to the station, I learned two key lessons. First, thanks to microcontrollers, even simple hardware can behave in non-intuitive ways that require careful reading of the documentation. Second, while this hardware is clever, it’s also inexpensive, and that means it is prone to failure. When it fails, take a deep breath, remember it’s only a few dollars, and replace it.
Hardware quirks, however, are only part of the story. In Wat 5, in Part 2, we’ll shift our focus back to software: the PIO programming language itself. We’ll uncover a behavior so unexpected, it might leave you questioning everything you thought you knew about constants.
Those are the first four Wats from programming the Pico PIO with Rust. You can find the code for the project on GitHub.
In Part 2 (expected next week), we’ll explore Wats 5 through 9. These will cover inconstant constants, conditions through the looking glass, overshooting jumps, too many pins, and kludgy debugging. We’ll also unveil the code for the finished musical instrument.
Follow me on Medium to get notified about this and future articles. I write on scientific programming in Rust and Python, machine learning, and statistics. I typically post one article a month.
Nine Pico PIO Wats with Rust (Part 1) was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
Raspberry Pi programmable IO pitfalls illustrated with a musical examplePico PIO Surprises — Source: https://openai.com/dall-e-2/. All other figures from the author.Also available: A MicroPython version of this articleIn JavaScript and other languages, we call a surprising or inconsistent behavior a “Wat!” [that is, a “What!?”]. For example, in JavaScript, an empty array plus an empty array produces an empty string, [] + [] === “”. Wat!Rust, by comparison, is consistent and predictable. However, one corner of Rust on the Raspberry Pi Pico microcontroller offers similar surprises. Specifically, the Pico’s Programmable Input/Output (PIO) subsystem, while incredibly powerful and versatile, comes with peculiarities.PIO programming matters because it provides an ingenious solution to the challenge of precise, low-level hardware control. It is incredibly fast and flexible: rather than relying on special-purpose hardware for the countless peripherals you might want to control, PIO allows you to define custom behaviors in software, seamlessly adapting to your needs without adding hardware complexity.Consider this simple example: a $15 theremin-like musical instrument. By waving their hand in the air, the musician changes the pitch of (admittedly annoying) tones. Using PIO provides a simple way to program this device that ensures it reacts instantly to movement.https://medium.com/media/26c2c5e54d35aff78d866c70bdc657ad/hrefSo, all is wonderful, except — to paraphrase Spider-Man:With great power comes… nine Wats!?We’ll explore and illustrate those nine PIO Wats through the creation of this theremin.Who Is This Article For?All Programmers: Microcontrollers like the Pico cost under $7 and support high-level languages like Python, Rust, and C/C++. 
This article will show how microcontrollers let your programs interact with the physical world and introduce you to programming the Pico’s low-level, high-performance PIO hardware.Rust Pico Programmers: Curious about the Pico’s hidden potential? Beyond its two main cores, it has eight tiny “state machines” dedicated to PIO programming. These state machines take over time-critical tasks, freeing up the main processors for other work and enabling surprising parallelism.C/C++ Pico Programmers: While this article uses Rust, PIO programming is — for good and bad — nearly identical across all languages. If you understand it here, you’ll be well-equipped to apply it in C/C++.MicroPython Pico Programmers: You may wish to read the MicroPython version of this article.PIO Programmers: The journey through nine Wats may not be as entertaining as JavaScript’s quirks (thankfully), but it will shed light on the peculiarities of PIO programming. If you’ve ever found PIO programming confusing, this article should reassure you that the problem isn’t (necessarily) you — it’s partly PIO itself. Most importantly, understanding these Wats will make writing PIO code simpler and more effective.Finally, this article isn’t about “fixing” PIO programming. PIO excels at its primary purpose: efficiently and flexibly handling custom peripheral interfaces. Its design is purposeful and well-suited to its goals. Instead, this article focuses on understanding PIO programming and its quirks — starting with a bonus Wat.Bonus Wat 0: “State Machines” Are Not State MachinesDespite their name, the eight “PIO state machines” in the Raspberry Pi Pico are not state machines in the formal computer science sense. Instead, they are tiny programmable processors with their own assembly-like instruction set, capable of looping, branching, and conditional operations. 
In reality, they are closer to Harvard architecture machines or, like most practical computers, von Neumann machines.Each state machine processes one instruction per clock cycle. The $4 Pico 1 runs at 125 million cycles per second, while the $5 Pico 2 offers a faster 150 million cycles per second. Each instruction performs a simple operation, such as “move a value” or “jump to a label”.With that bonus Wat out of the way, let’s move to our first main Wat.Wat 1: The Register Hunger GamesIn PIO programming, a register is a small, fast storage location that acts like a variable for the state machine. You might dream of an abundance of variables to hold your counters, delays, and temporary values, but the reality is brutal: you only get two general-purpose registers, x and y. It’s like The Hunger Games, where no matter how many tributes enter the arena, only Katniss and Peeta emerge as victors. You’re forced to winnow down your needs to fit within these two registers, ruthlessly deciding what to prioritize and what to sacrifice. Also, like the Hunger Games, we can sometimes bend the rules.Let’s start with a challenge: create a backup beeper — 1000 Hz for ½ second, silence for ½ second, repeat. The result? “Beep Beep Beep…”We would like five variables:half_period: The number of clock cycles to hold the voltage high and then low to create a 1000 Hz tone. This is 125,000,000 / 1000 / 2 = 62,500 cycles high and 62, 500 cycles low.Voltage and timing (millisecond and clock cycles) to generate a 1000 Hz square tone.y: Loop counter from 0 to half_period to create a delay.period_count: The number of repeated periods needed to fill ½ second of time. 125,000,000 × 0.5 / (62,500 × 2) = 500.x: Loop counter from 0 to period_count to fill ½ second of time.silence_cycles: The number of clock cycles for ½ second of silence. 125,000,000 × 0.5 = 62,500,000.We want five registers but can only have two, so let the games begin! 
May the odds be ever in your favor.

First, we can eliminate silence_cycles because it can be derived as half_period × period_count × 2. While PIO doesn’t support multiplication, it does support loops. By nesting two loops, where the inner loop delays for 2 clock cycles, we can create a delay of 62,500,000 clock cycles.

One variable down, but how can we eliminate two more? Fortunately, we don’t have to. While PIO only provides two general-purpose registers, x and y, it also includes two special-purpose registers: osr (output shift register) and isr (input shift register).

The PIO code that we’ll see in a moment implements the backup beeper. Here’s how it works:

Initialization:
- The pull block instruction reads the half period of the tone (62,500 clock cycles) from a buffer and places the value into osr.
- The value is then copied to isr for later use.
- The second pull block reads the period count (500 repeats) from the buffer and places the value in osr, where we leave it.

Beep Loops:
- The mov x, osr instruction copies the period count into the x register, which serves as the outer loop counter.
- For the inner loops, mov y, isr repeatedly copies the half period into y to create delays for the high and low states of the tone.

Silence Loops:
- The silence loops mirror the structure of the beep loops but don’t set any pins, so they act solely as a delay.

Wrap and Continuous Execution:
- The .wrap_target and .wrap directives define the main loop of the state machine.
- After finishing both the beep and silence loops, the state machine jumps back near the start of the program, repeating the sequence indefinitely.

With this outline in mind, here’s the PIO assembly code for generating the backup beeper signal.

    .program backup

    ; Read initial configuration
        pull block                  ; Read the half period of the beep sound
        mov isr, osr                ; Store the half period in ISR
        pull block                  ; Read the period_count

    .wrap_target                    ; Start of the main loop

    ; Generate the beep sound
        mov x, osr                  ; Load period_count into X
    beep_loop:
        set pins, 1                 ; Set the buzzer to high voltage (start the tone)
        mov y, isr                  ; Load the half period into Y
    beep_high_delay:
        jmp y--, beep_high_delay    ; Delay for the half period

        set pins, 0                 ; Set the buzzer to low voltage (end the tone)
        mov y, isr                  ; Load the half period into Y
    beep_low_delay:
        jmp y--, beep_low_delay     ; Delay for the low duration

        jmp x--, beep_loop          ; Repeat the beep loop

    ; Silence between beeps
        mov x, osr                  ; Load the period count into X for outer loop
    silence_loop:
        mov y, isr                  ; Load the half period into Y for inner loop
    silence_delay:
        jmp y--, silence_delay [1]  ; Delay for two clock cycles (jmp + 1 extra)

        jmp x--, silence_loop       ; Repeat the silence loop

    .wrap                           ; End of the main loop, jumps back to wrap_target

Here’s the core Rust code to configure and run the PIO program for the backup beeper. It uses the Embassy framework for embedded applications. The function initializes the state machine, calculates the timing values (half_period and period_count), and sends them to the PIO. It then plays the beeping sequence for 5 seconds before entering an endless loop.
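As a rough check of the timing values described above, the arithmetic can be run on a desktop in plain Rust. This is only a sketch, not part of the project: it assumes a 125 MHz system clock (which is what a half period of 62,500 cycles for a 1 kHz tone implies), and the function and constant names are hypothetical.

```rust
// Assumption: a 125 MHz system clock and a 1 kHz beep tone.
const ASSUMED_CLOCK_HZ: u32 = 125_000_000;
const TONE_HZ: u32 = 1_000;

/// Clock cycles in half of one tone period.
fn half_period(clock_hz: u32, tone_hz: u32) -> u32 {
    clock_hz / tone_hz / 2
}

/// Number of full tone periods that fit in a half-second beep.
fn period_count(clock_hz: u32, half_period: u32) -> u32 {
    clock_hz / (half_period * 2) / 2
}

/// Clock cycles of silence, derived instead of stored in a register.
fn silence_cycles(half_period: u32, period_count: u32) -> u32 {
    half_period * period_count * 2
}

fn main() {
    let hp = half_period(ASSUMED_CLOCK_HZ, TONE_HZ);
    let pc = period_count(ASSUMED_CLOCK_HZ, hp);
    let sc = silence_cycles(hp, pc);
    // Prints: half_period = 62500, period_count = 500, silence_cycles = 62500000
    println!("half_period = {hp}, period_count = {pc}, silence_cycles = {sc}");
}
```

With these assumptions, the beep and the silence each last 62,500,000 cycles, or half a second.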
The full source file and project are available on GitHub.

    async fn inner_main(_spawner: Spawner) -> Result<Never> {
        info!("Hello, back_up!");
        let hardware: Hardware<'_> = Hardware::default();
        let mut pio0 = hardware.pio0;
        let state_machine_frequency = embassy_rp::clocks::clk_sys_freq();
        let mut back_up_state_machine = pio0.sm0;
        let buzzer_pio = pio0.common.make_pio_pin(hardware.buzzer);
        back_up_state_machine.set_pin_dirs(Direction::Out, &[&buzzer_pio]);
        back_up_state_machine.set_config(&{
            let mut config = Config::default();
            config.set_set_pins(&[&buzzer_pio]); // For set instruction
            let program_with_defines = pio_file!("examples/backup.pio");
            let program = pio0.common.load_program(&program_with_defines.program);
            config.use_program(&program, &[]);
            config
        });

        back_up_state_machine.set_enable(true);
        let half_period = state_machine_frequency / 1000 / 2;
        let period_count = state_machine_frequency / (half_period * 2) / 2;
        info!(
            "Half period: {}, Period count: {}",
            half_period, period_count
        );
        back_up_state_machine.tx().wait_push(half_period).await;
        back_up_state_machine.tx().wait_push(period_count).await;
        Timer::after(Duration::from_millis(5000)).await;
        info!("Disabling back_up_state_machine");
        back_up_state_machine.set_enable(false);

        // run forever
        loop {
            Timer::after(Duration::from_secs(3_153_600_000)).await; // 100 years
        }
    }

Here’s what happens when you run the program:

https://medium.com/media/1bea567079143d6da85e57edc2eec35b/href

Aside 1: Running this yourself

The simplest (but often frustrating) way to run Rust code on the Pico is to cross-compile it on your desktop and manually copy over the resulting files. A much better approach is to invest in a $12 Raspberry Pi Debug Probe and set up probe-rs on your desktop. With this setup, you can use cargo run to automatically compile on your desktop, copy to your Pico, and then start your code running. Even better, your Pico code can use info!
statements to send messages back to your desktop, and you can perform interactive breakpoint debugging. For setup instructions, visit the probe-rs website.

To hear sound, I connected a passive buzzer, a resistor, and a transistor to the Pico. For detailed wiring diagrams and a parts list, check out the passive buzzer instructions in the SunFounder’s Kepler Kit.

Aside 2: If your only goal is to generate tones with the Pico, PIO isn’t necessary. MicroPython is fast enough to toggle pins directly, or you can use the Pico’s built-in pulse width modulation (PWM) feature.

Alternative Endings to the Register Hunger Games

We used four registers (two general and two special) to resolve the challenge. If this solution feels less than satisfying, here are alternative approaches to consider:

- Use Constants: Why make half_period, period_count, and silence_cycles variables at all? Hardcoding the constants “62,500,” “500,” and “62,500,000” could simplify the design. However, PIO constants have limitations, which we’ll explore in Wat 5.
- Pack Bits: Registers hold 32 bits. Do we really need two registers (2×32=64 bits) to store half_period and period_count? No. Storing 62,500 only requires 16 bits, and 500 requires 9 bits. We could pack these into a single register and use the out instruction to shift values into x and y. This approach would free up either osr or isr for other tasks, but only one at a time: the other register must hold the packed value.
- Slow Motion: In Rust with the Embassy framework, you can configure a PIO state machine to run at a slower frequency by setting its clock_divider. This allows the state machine to run as slow as ~1907 Hz. Running the state machine at a slower speed means that values like half_period can be smaller, potentially as small as 2.
Small values are easier to hardcode as constants and more compactly bit-packed into registers.

A Happy Ending to the Register Hunger Games

The Register Hunger Games demanded strategic sacrifices and creative workarounds, but we emerged victorious by leveraging PIO’s special registers and clever looping structures. If the stakes had been higher, alternative techniques could have helped us adapt and survive.

But victory in one arena doesn’t mean the challenges are over. In the next Wat, we face a new trial: PIO’s strict 32-instruction limit.

Wat 2: The 32-Instruction Carry-On Suitcase

Congratulations! You’ve purchased a trip around the world for just $4. The catch? All your belongings must fit into a tiny carry-on suitcase. Likewise, PIO programs allow you to create incredible functionality, but every PIO program is limited to just 32 instructions.

Wat! Only 32 instructions? That’s not much space to pack everything you need! But with clever planning, you can usually make it work.

The Rules

- No PIO program can be longer than 32 instructions.
- The wrap_target and wrap directives do not count.
- Labels do not count.
- A Pico 1 includes eight state machines, organized into two blocks of four. A Pico 2 includes twelve state machines, organized into three blocks of four. Each block shares 32 instruction slots. So, because all four state machines in a block draw from the same 32-instruction pool, if one machine’s program uses all 32 slots, there’s no space left for the other three.

When Your Suitcase Won’t Close

If your idea doesn’t fit in the PIO instruction slots, these packing tricks may help. (Disclaimer: I haven’t tried all of these myself.)

- Swap PIO Programs on the Fly: Instead of trying to cram everything into one program, consider swapping out programs mid-flight. Load only what you need, when you need it.
- Share Programs Across State Machines: Multiple state machines can run the same program at the same time.
Each state machine can make the shared program behave differently based on an input value.
- Use Rust/Embassy’s exec_instr Command: Save space by offloading instructions to Rust. For example, you can execute initialization steps before enabling the state machine:

    let half_period = state_machine_frequency / 1000 / 2;
    back_up_state_machine.tx().push(half_period); // Using non-blocking push since FIFO is empty
    let pull_block = pio_asm!("pull block").program.code[0];
    unsafe {
        back_up_state_machine.exec_instr(pull_block);
    }

- Use PIO’s exec commands: Within your state machine, you can dynamically execute instructions using PIO’s exec mechanism. For example, you can execute an instruction value stored in osr with out exec. Alternatively, you can use mov exec, x or mov exec, y to execute instructions directly from those registers.
- Offload to the Main Processors: If all else fails, move more of your program to the Pico’s larger dual processors. Think of this as shipping your extra baggage to your destination separately. The Pico SDK (section 3.1.4) calls this “bit banging”.

With your bags now packed, let’s join Dr. Dolittle’s search for a fabled creature.

Bonus Wat 2.5: Dr. Dolittle’s PIO Pushmi-Pullyu

Two readers pointed out an important PIO Wat that I missed, so here’s a bonus! When programming PIO, you’ll notice something peculiar:

- The PIO pull instruction receives values from the TX FIFO (transmit buffer) and inputs them into the output shift register (osr). So, it inputs into output and receives from transmit.
- Likewise, the PIO push instruction outputs values from the input shift register (isr) and transmits them to the RX FIFO (receive buffer). So, it outputs from input and transmits to receive.

Wat!? Like the two-headed Pushmi-Pullyu from the Dr. Dolittle stories, something seems backwards.
But it starts to make sense when you realize PIO names most things from the host’s perspective (MicroPython, Rust, C/C++), not the point of view of the PIO program.

This table summarizes the instructions, registers, and buffer names. (“FIFO” stands for first-in-first-out.)

With the Pushmi-Pullyu in hand, we next move to the scene of a mystery.

Wat 3: The pull noblock Mystery

In Wat 1, we programmed our audio hardware as a backup beeper. But that’s not what we need for our musical instrument. Instead, we want a PIO program that plays a given tone indefinitely, until it’s told to play a new one. The program should also wait silently when given a special “rest” tone.

Resting until a new tone is provided is easy to program with pull block (we’ll explore the details below). Playing a tone at a specific frequency is also straightforward, building on the work we did in Wat 1.

But how can we check for a new tone while continuing to play the current one? The answer lies in using “noblock” instead of “block” in pull noblock. Now, if there’s a new value, it will be loaded into osr, allowing the program to update seamlessly.

Here’s where the mystery begins: what happens to osr if pull noblock is called and there’s no new value?

I assumed it would keep its previous value. Wrong! Maybe it gets reset to 0? Wrong again! The surprising truth: it gets the value of x. Why? (No, not y, but x.) Because the Pico SDK says so. Specifically, section 3.4.9.2 explains:

“A nonblocking PULL on an empty FIFO has the same effect as MOV OSR, X.”

Knowing how pull noblock works is important, but there’s a bigger lesson here. Treat the Pico SDK documentation like the back of a mystery novel. Don’t try to solve everything on your own: cheat! Skip to the “who done it” section, and in section 3.4, read the fine details for each command you use. Reading just a few paragraphs can save you hours of confusion.

Aside: When even the SDK documentation feels unclear, turn to the RP2040 (Pico 1) and RP2350 (Pico 2) datasheets.
These encyclopedias (600 and 1,300 pages respectively) are like omniscient narrators: they provide the ground truth.

With this in mind, let’s look at a practical example. Below is the PIO program for playing tones and rests continuously. It uses pull block to wait for input during a rest and pull noblock to check for updates while playing a tone.

    .program sound

    ; Rest until a new tone is received.
    resting:
        pull block                  ; Wait for a new delay value
        mov x, osr                  ; Copy delay into X
        jmp !x resting              ; If delay is zero, keep resting

    ; Play the tone until a new delay is received.
    .wrap_target                    ; Start of the main loop
        set pins, 1                 ; Set the buzzer high voltage.
    high_voltage_loop:
        jmp x-- high_voltage_loop   ; Delay

        set pins, 0                 ; Set the buzzer low voltage.
        mov x, osr                  ; Load the half period into X.
    low_voltage_loop:
        jmp x-- low_voltage_loop    ; Delay

    ; Read any new delay value. If none, keep the current delay.
        mov x, osr                  ; set x, the default value for "pull(noblock)"
        pull noblock                ; Read a new delay value or use the default.

    ; If the new delay is zero, rest. Otherwise, continue playing the tone.
        mov x, osr                  ; Copy the delay into X.
        jmp !x resting              ; If X is zero, rest.
    .wrap                           ; Continue playing the sound.

We’ll eventually use this PIO program in our theremin-like musical instrument. For now, let’s see the PIO program in action by playing a familiar melody. This demo uses “Twinkle, Twinkle, Little Star” to show how you can control a melody by feeding frequencies and durations to the state machine.
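The value fed to the state machine for each note is a half period measured in clock cycles, and that conversion is small enough to sketch in plain Rust on a desktop. The helper name half_period_cycles and the 125 MHz clock are assumptions for illustration only; on the Pico, the clock value would come from clk_sys_freq().

```rust
/// Assumed state-machine clock for this sketch (125 MHz).
const ASSUMED_CLOCK_HZ: u32 = 125_000_000;

/// Hypothetical helper: convert a tone frequency in Hz into the half period
/// (in clock cycles) that the `sound` PIO program expects.
/// A frequency of 0 maps to 0, which the PIO program treats as a rest.
fn half_period_cycles(clock_hz: u32, frequency_hz: u32) -> u32 {
    if frequency_hz == 0 {
        0
    } else {
        clock_hz / frequency_hz / 2
    }
}

fn main() {
    // Middle C, the A above it, and a rest.
    for frequency_hz in [262, 440, 0] {
        let cycles = half_period_cycles(ASSUMED_CLOCK_HZ, frequency_hz);
        println!("{frequency_hz} Hz -> {cycles} cycles per half period");
    }
}
```

Because the division is integer division, the resulting pitch is very slightly off from the requested frequency, which is inaudible at these clock rates.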
With just the code below (full file and project), you can make the Pico sing!

    const TWINKLE_TWINKLE: [(u32, u64, &str); 16] = [
        // Bar 1
        (262, 400, "Twin-"), // C
        (262, 400, "-kle"),  // C
        (392, 400, "twin-"), // G
        (392, 400, "-kle"),  // G
        (440, 400, "lit-"),  // A
        (440, 400, "-tle"),  // A
        (392, 800, "star"),  // G
        (0, 400, ""),        // rest
        // Bar 2
        (349, 400, "How"),   // F
        (349, 400, "I"),     // F
        (330, 400, "won-"),  // E
        (330, 400, "-der"),  // E
        (294, 400, "what"),  // D
        (294, 400, "you"),   // D
        (262, 800, "are"),   // C
        (0, 400, ""),        // rest
    ];

    async fn inner_main(_spawner: Spawner) -> Result<Never> {
        info!("Hello, sound!");
        let hardware: Hardware<'_> = Hardware::default();
        let mut pio0 = hardware.pio0;
        let state_machine_frequency = embassy_rp::clocks::clk_sys_freq();
        let mut sound_state_machine = pio0.sm0;
        let buzzer_pio = pio0.common.make_pio_pin(hardware.buzzer);
        sound_state_machine.set_pin_dirs(Direction::Out, &[&buzzer_pio]);
        sound_state_machine.set_config(&{
            let mut config = Config::default();
            config.set_set_pins(&[&buzzer_pio]); // For set instruction
            let program_with_defines = pio_file!("examples/sound.pio");
            let program = pio0.common.load_program(&program_with_defines.program);
            config.use_program(&program, &[]);
            config
        });

        sound_state_machine.set_enable(true);
        for (frequency, ms, lyrics) in TWINKLE_TWINKLE.iter() {
            if *frequency > 0 {
                let half_period = state_machine_frequency / frequency / 2;
                info!("{} - Frequency: {}", lyrics, frequency);
                // Send the half period to the PIO state machine
                sound_state_machine.tx().wait_push(half_period).await;
                Timer::after(Duration::from_millis(*ms)).await; // Wait as the tone plays
                sound_state_machine.tx().wait_push(0).await; // Stop the tone
                Timer::after(Duration::from_millis(50)).await; // Give a short pause between notes
            } else {
                sound_state_machine.tx().wait_push(0).await; // Play a silent rest
                Timer::after(Duration::from_millis(*ms + 50)).await; // Wait for the rest duration + a short pause
            }
        }
        info!("Disabling sound_state_machine");
        sound_state_machine.set_enable(false);

        // run forever
        loop {
            Timer::after(Duration::from_secs(3_153_600_000)).await; // 100 years
        }
    }

Here’s what happens when you run the program:

https://medium.com/media/b76f2dc0c295f64853de7608d763c3fb/href

We’ve solved one mystery, but there’s always another challenge lurking around the corner. In Wat 4, we’ll explore what happens when your smart hardware comes with a catch: it’s also very cheap.

Wat 4: Smart, Cheap Hardware: An Emotional Roller Coaster

With sound working, we turn next to measuring the distance to the musician’s hand using the HC-SR04+ ultrasonic range finder. This small but powerful device is available for less than two dollars.

HC-SR04+ Range Finder (Pen added for scale.)

This little peripheral took me on an emotional roller coaster of “Wats!?”:

- Up: Amazingly, this $2 range finder includes its own microcontroller, making it smarter and easier to use.
- Down: Frustratingly, that same “smart” behavior is unintuitive.
- Up: Conveniently, the Pico can supply peripherals with either 3.3V or 5V power.
- Down: Unpredictably, many range finders are unreliable, or fail outright, at 3.3V, and they can damage your Pico at 5V.
- Up: Thankfully, both damaged range finders and Picos are inexpensive to replace, and a dual-voltage version of the range finder solved my problems.

Details

I initially assumed the range finder would set the Echo pin high when the echo returned. I was wrong.

Instead, the range finder emits a pattern of 8 ultrasonic pulses at 40 kHz (think of it as a backup beeper for dogs). Immediately after, it sets Echo high. The Pico should then start measuring the time until Echo goes low, which signals that the sensor detected the pattern, or that it timed out.

As for voltage, the documentation specifies the range finder operates at 5V. It seemed to work at 3.3V, until it didn’t.
Around the same time, while my Pico kept working with Rust (via the Debug Probe and probe-rs), it stopped working with any of the MicroPython IDEs, which rely on a special USB protocol.

So, at this point both the Pico and the range finder were damaged.

After experimenting with various cables, USB drivers, programming languages, and even an older 5V-only range finder, I finally resolved the issue by:

- Continuing to use this Pico with Rust but switching to another Pico for MicroPython.
- Buying a new dual-voltage 3.3/5V range finder, still just $2 per piece.

Wat 4: Lessons Learned

As the roller coaster returns to the station, I learned two key lessons. First, thanks to microcontrollers, even simple hardware can behave in non-intuitive ways that require careful reading of the documentation. Second, while this hardware is clever, it’s also inexpensive, and that means it is prone to failure. When it fails, take a deep breath, remember it’s only a few dollars, and replace it.

Hardware quirks, however, are only part of the story. In Wat 5, in Part 2, we’ll shift our focus back to software: the PIO programming language itself. We’ll uncover a behavior so unexpected, it might leave you questioning everything you thought you knew about constants.

Those are the first four Wats from programming the Pico PIO with Rust. You can find the code for the project on GitHub.

In Part 2 (expected next week), we’ll explore Wats 5 through 9. These will cover inconstant constants, conditions through the looking glass, overshooting jumps, too many pins, and kludgy debugging. We’ll also unveil the code for the finished musical instrument.

Follow me on Medium to get notified about this and future articles. I write on scientific programming in Rust and Python, machine learning, and statistics.
I typically post one article a month.

Nine Pico PIO Wats with Rust (Part 1) was originally published in Towards Data Science on Medium.