IRQ sharing, the PCI bus, ACPI/APIC addressing and your Scope & Xite cards

An area for people to discuss Scope related problems, issues, etc.

Moderators: valis, garyb

valis
Posts: 7306
Joined: Sun Sep 23, 2001 4:00 pm
Location: West Coast USA

IRQ sharing, the PCI bus, ACPI/APIC addressing and your Scope & Xite cards

Post by valis »

What is the purpose of interrupts? Of IRQs?

Say we want to talk to a peripheral like the network card as soon as a network packet arrives.

Before the interrupt was invented, you had to know the timing of that hardware buffer exactly, and you would time your main software loop so that when it came back around to the function that polled the hardware, it hit that point right when the peripheral was 'ready to go'. In other words, you would continuously ask the network card "Has my packet arrived?" and spend your processor time talking to that device at the same point every time through the software loop. In fact, much of the computer's software in that era spent much of its time in that portion of the overall program loop: polling input and output devices, making sure the screen was written to, the keyboard buffer was read, and so on. It's also worth noting that as computers scaled up in speed & complexity, this would have cost more & more CPU time, as the wait states for those devices vastly exceeded what the computer could do on internal loops.
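To make that concrete, here's a rough C sketch of what that polling style looked like. The device functions are hypothetical placeholders (not any real API), just to show the shape of the loop:

[code]
/* A sketch of the pre-interrupt "main loop" style. Every device function
 * here is a made-up stub, not a real API; the point is the shape of the
 * loop, which polls every device at the same spot on every pass. */

#include <stdbool.h>

static bool network_packet_ready(void) { return false; } /* poll NIC status    */
static void read_packet(void)          { }               /* copy packet out    */
static bool key_in_buffer(void)        { return false; } /* poll keyboard      */
static void read_key(void)             { }               /* consume keystroke  */
static void redraw_screen(void)        { }               /* refresh display    */
static void do_application_work(void)  { }               /* a slice of "real" work */

int main(void)
{
    for (;;) {                       /* the single main loop */
        do_application_work();

        if (network_packet_ready())  /* ask the card every pass... */
            read_packet();           /* ...whether it has data or not */

        if (key_in_buffer())
            read_key();

        redraw_screen();             /* timing stays rock solid because each
                                        device is serviced at the same point
                                        in the loop every time around */
    }
}
[/code]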

Note: The upshot of this era is that computers, when working properly, were rock solid in terms of timing for MIDI, as the loop would come around to service the MIDI device at the same point every time (and audio too, though that was limited in its capability on computers of that time). We can call this the era of the 'Realtime Operating System'...

Now, it's important to keep in mind that at the low level in this era we only have a limited number of addressing lines to the CPU. That is, the CPU is directly connected to devices via copper traces, so you had to add more addressing lines to the CPU to increase onboard device counts. This meant that either the CPU grew in complexity as devices were added onboard, or computers were very limited in the roles they could perform; when you wanted more peripherals to cover 'more roles', a new computer was needed, as the number of peripherals each machine could support (at a similar price point) was clearly quite limited.

In any case, this was also the era where we transitioned into 'preemptive multitasking', and since things would be 'interrupted' to ensure all tasks and devices got their 'share of the time' on the CPU, timing issues grew.

This is of course when IRQs came along. IRQ means "Interrupt ReQuest": the CPU gets "interrupted" by a request from hardware external to the CPU/memory layer when, say, the network card is ready. This signal is actually seen by the CPU on its INTR line, of which there is just one. What should we do if there are a lot of external devices? Well, that would require a TON of INTR pins on the CPU, one for each of them..

To solve this problem a special chip was created: an interrupt controller, or PIC (Programmable Interrupt Controller), which connects multiple devices to the single INTR line. This first-generation interrupt controller chip (the 8259 PIC) had 8 input lines (IRQs 0-7) and 1 output line (which connects the PIC to the INTR line of the CPU). When there's an interrupt request from a peripheral, the PIC signals the CPU over the INTR line, so the CPU knows the PIC chip needs its attention. As soon as it's able to, the CPU will 'interrupt' what it's doing (handle the 'interrupt request') and ask the PIC which of the 8 input lines was the source. That adds a tiny bit of overhead compared to polling each device directly, but now we have 8 devices sharing the 1 INTR line to the CPU.
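For the curious, 'asking the PIC which line fired' is literally a couple of port reads/writes. Here's a minimal freestanding x86 sketch (kernel-level code, not something you'd run in user space) using the documented 8259 ports; the OCW3 command 0x0B selects the In-Service Register, and reading the command port then returns a bitmask with one bit per IRQ line:

[code]
#include <stdint.h>

#define PIC1_CMD      0x20   /* master 8259 command/status port */
#define OCW3_READ_ISR 0x0B   /* "next read from the command port returns the ISR" */

static inline void outb(uint16_t port, uint8_t val)
{
    __asm__ volatile ("outb %0, %1" : : "a"(val), "Nd"(port));
}

static inline uint8_t inb(uint16_t port)
{
    uint8_t val;
    __asm__ volatile ("inb %1, %0" : "=a"(val) : "Nd"(port));
    return val;
}

/* Returns a bitmask of the master PIC's in-service IRQs (bit 0 = IRQ0, etc.). */
uint8_t pic_in_service(void)
{
    outb(PIC1_CMD, OCW3_READ_ISR);
    return inb(PIC1_CMD);
}
[/code]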

Let's break this down in the context of our network card example:

Rather than the CPU polling the network card every time through its loop over a physical connection running directly from that device to a CPU pin on a copper trace, we now have a card that is cascaded through a separate chip (or function on a chip) along with several other devices. So after a network packet is received, the network card signals over the line that connects it to the IRQ controller chip to say "I'm ready!". The IRQ controller chip then connects to the CPU via that INTR line and says "hey, let me interrupt you!". The CPU senses this INTR signal and knows that the network card has information for it. Only then does the CPU read the incoming packet directly from the network card, most likely buffering that data for later internal use.


Oh, also, our network card is now sharing the INTR line with 7 other devices, right? Clearly, 8 lines weren't enough for long. So two 8259 controllers (a master and a slave) were connected in a cascade, which gave us 16 IRQ inputs. IRQs 0-7 were handled by the master, and 8 to 15 by the slave. Only the master is connected to the CPU and can signal interrupts; the slave has to use one of the master's inputs (IRQ2) to tell the master that it has an interrupt request ready to handle. Technically that meant losing one IRQ line, leaving a total of 15 interrupts for all devices.
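If you want to see exactly where that 'lost' line goes, here's the classic master/slave 8259 initialization sequence as a freestanding x86 sketch (the vector offsets 0x20/0x28 are just the usual example choice). The pair of ICW3 writes is the part that matters for the paragraph above: they wire the slave onto the master's IRQ2 input.

[code]
#include <stdint.h>

#define PIC1_CMD  0x20
#define PIC1_DATA 0x21
#define PIC2_CMD  0xA0
#define PIC2_DATA 0xA1

static inline void outb(uint16_t port, uint8_t val)
{
    __asm__ volatile ("outb %0, %1" : : "a"(val), "Nd"(port));
}

void pic_remap(void)
{
    outb(PIC1_CMD,  0x11);  /* ICW1: begin init, cascade mode, ICW4 follows   */
    outb(PIC2_CMD,  0x11);
    outb(PIC1_DATA, 0x20);  /* ICW2: master IRQs 0-7  -> vectors 0x20-0x27    */
    outb(PIC2_DATA, 0x28);  /* ICW2: slave  IRQs 8-15 -> vectors 0x28-0x2F    */
    outb(PIC1_DATA, 0x04);  /* ICW3: a slave sits on the master's IRQ2 input  */
    outb(PIC2_DATA, 0x02);  /* ICW3: the slave's cascade identity is 2        */
    outb(PIC1_DATA, 0x01);  /* ICW4: 8086/88 mode */
    outb(PIC2_DATA, 0x01);
}
[/code]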

So now we have all 15 interrupts we know from a pre-ACPI system. But when the PCI bus replaced the ISA bus, the number of devices on motherboards again began to exceed the IRQ lines available (more than 15).

So interrupt sharing started, and several devices began to share the same IRQ. On the PCI side the "IRQ" technically became a PIRQ (PCI interrupt request). When a PIRQ fires, the processor not only stops and services that IRQ line, it also has to ask which of the connected devices was responsible for the request, and there's a device involved THERE called a PIR router (PCI interrupt router) that connects the PCI interrupt lines to the PIC inputs to service the PIRQ (another chip is cascaded again!). By the way, this also introduced "IRQ steering", where the internal table of peripherals could be remapped to different IRQ 'pins' as exposed to the original IRQ way of thinking.

Now, once this was done, the ISA lines had to be kept completely separate as well. This is why they show up mapped to different IRQs than the PCI bus devices use: ISA-level devices cannot be dynamically mapped or 'steered', and so would conflict if routed through the PIR chip.

This whole discussion is oversimplified btw, since in the real world each PCI device has 4 interrupt lines (INTA/B/C/D) and up to 8 functions, where each function can only have one INTx interrupt. Which INTx line is used by each function is determined by the chipset... and so on. Layers of complexity are present now that were not present on an original ISA (16-bit) based system, which PCI (32-bit) is more or less 'extending from'.

Or, just to be accurate enough for those who need it: the main takeaway is that the PIC interrupt routing information is handed to the OS by the BIOS, with the help of the $PIR table and through the registers 3Ch (INT_LN, Interrupt Line, R/W) and 3Dh (INT_PN, Interrupt Pin, RO) of the PCI configuration space for each function. This table can be understood by referring to the PCI BIOS Specification (4.2.2, Get PCI Interrupt Routing Options), or in our case, by looking at the BIOS IRQ map screen that we used to see during boot POST prior to about 2009. Not to mention we have since migrated to EFI... Complexity keeps entering the picture, doesn't it...
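To make those two registers concrete, here's a sketch of reading them through the legacy 0xCF8/0xCFC configuration mechanism. Again this is freestanding x86 illustration only; on a real OS you'd go through the kernel's PCI layer rather than poking the ports yourself:

[code]
#include <stdint.h>

#define PCI_CONFIG_ADDRESS 0xCF8
#define PCI_CONFIG_DATA    0xCFC

static inline void outl(uint16_t port, uint32_t val)
{
    __asm__ volatile ("outl %0, %1" : : "a"(val), "Nd"(port));
}

static inline uint32_t inl(uint16_t port)
{
    uint32_t val;
    __asm__ volatile ("inl %1, %0" : "=a"(val) : "Nd"(port));
    return val;
}

/* Read the 32-bit dword at 'offset' of a function's configuration space. */
static uint32_t pci_cfg_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t offset)
{
    uint32_t addr = (1u << 31)             /* enable bit           */
                  | ((uint32_t)bus << 16)
                  | ((uint32_t)dev << 11)
                  | ((uint32_t)fn  <<  8)
                  | (offset & 0xFC);       /* dword-aligned offset */
    outl(PCI_CONFIG_ADDRESS, addr);
    return inl(PCI_CONFIG_DATA);
}

void pci_read_irq_info(uint8_t bus, uint8_t dev, uint8_t fn,
                       uint8_t *int_line, uint8_t *int_pin)
{
    uint32_t dword = pci_cfg_read32(bus, dev, fn, 0x3C);
    *int_line = dword & 0xFF;        /* 3Ch: the IRQ the BIOS/OS routed this function to */
    *int_pin  = (dword >> 8) & 0xFF; /* 3Dh: 1 = INTA#, 2 = INTB#, 3 = INTC#, 4 = INTD#  */
}
[/code]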

(Oh, and all of this worked as long as there was only ONE processor. While there is a way for multiple CPUs to share the PIC's requests, it was prone to error. And as we now know, all machines were about to become multicore anyway, but at the time multiple sockets were the only way to scale CPU power within a single system.)

A better solution was found: the APIC (Advanced Programmable Interrupt Controller). For multi-socket machines, a controller called the LAPIC (Local APIC) was added to each processor, along with an I/O APIC controller for routing interrupts from external devices. When an external interrupt arrives at an I/O APIC input, the controller sends an interrupt message to the LAPIC of one of the system's CPUs. In this way the I/O APIC controller helps balance the interrupt load between processors.

As with the PIC before it, what began as separate chips later became part of the chipset. These chips evolved over time, so there are actually several versions of the APIC controller. This had its downsides: as complexity increased, all of the interrupt lines from devices made the system very complicated and thus more error prone.

And then on modern machines the PCI Express bus came to replace the PCI bus, which simplified the whole interrupt system: it doesn't have interrupt lines at all. For backwards compatibility, interrupt signals (INTx#) are emulated with a separate kind of message. With PCI, the interrupt lines were physical wires; with PCI Express, the 'connection' is logical and is made by PCI Express bridges. But this support for legacy INTx interrupts only exists for backwards compatibility with the PCI bus. PCI Express introduces a completely new method of interrupt delivery: MSI (Message Signaled Interrupts). With MSI, a device signals an interrupt simply by writing to a special place in the MMIO region of the CPU's LAPIC.
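To give a feel for what an MSI 'message' actually is: the OS programs two values into the device's MSI capability (Message Address and Message Data), and from then on the device raises an interrupt simply by performing a memory write of that data to that address, which lands in the LAPIC's MMIO window at 0xFEExxxxx. A conceptual sketch of those two values (field layout per Intel's documentation; illustration only, not driver code):

[code]
#include <stdint.h>

/* Build the MSI Message Address for a given target LAPIC ID. */
uint32_t msi_address(uint8_t dest_apic_id)
{
    return 0xFEE00000u                     /* fixed LAPIC MMIO base         */
         | ((uint32_t)dest_apic_id << 12); /* which CPU's LAPIC receives it */
}

/* Build the MSI Message Data for a given interrupt vector
 * (fixed delivery mode, edge triggered). */
uint32_t msi_data(uint8_t vector)
{
    return (uint32_t)vector;  /* bits 0-7: vector; remaining bits left 0 =
                                 fixed delivery, edge trigger */
}

/* The OS writes these into the device's MSI capability (capability ID 0x05)
 * in PCI configuration space. When the device later needs service, it simply
 * DMA-writes msi_data() to msi_address(); no INTx wire, PIC or I/O APIC pin
 * is involved at all. */
[/code]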

So our Scope cards are specifically PCI devices, but they now live in a system that has evolved onward and only still supports them through backwards compatibility to that standard, implemented as part of the modern PCIe bridge. Our cards came along somewhere between "Plug and Play" maturing (Win 98SE) and the APIC controllers taking over our computers, so the methods we used back then to troubleshoot and resolve problems became, more or less, the fundamental understanding of 'how to make Scope cards work'. But this ecosystem has evolved enough over time that it's important to understand that what an IRQ even *is* has changed fundamentally by now as well (with PCIe controllers running everything in the CPU & chipset).

Backing up again for a moment... Plug & Play basically meant that you could go into the BIOS and change certain parameters, move cards to the right slot, and Windows would "figure it out", set the driver to the proper settings (if present on the system already) and enable the device. In some cases you could even open the driver properties in Device Manager and switch settings, and the device would 'see' this change from the driver (oh, and there was an era before this when that corresponded to on-device jumper settings btw). So swapping cards around and disabling onboard devices made a ton of sense.

And then of course the APIC controllers came, roughly around the point our second-generation cards did, in fact. Remember how the early-era PIC controller pair maps to the physical PCI slots and gives us 15 IRQs? Well, it was extended with another controller on the end, and disabling it at this point (switching from "ACPI PC" to "Standard PC", or on a multiprocessor system "MPS Multiprocessor PC") was enough to get back to the cascaded PIC implementation, as the newer APIC controller would simply drop down to emulating that. This, imho, is why that fix then became the standard!

But fast forward to today, and we have gone beyond mapping everything to a few IRQs via the APIC controller & ACPI stack to where it's all emulated inside a PCI Express bridge, and everything is dynamic! We are certainly ill advised to try changing the HAL on an 8th-gen Core i7 system running Windows 10, and to be honest I don't even do it on my 2009-era rigs running Win7. "Standard PC" and "MPS Multiprocessor PC" are no longer viable 'fixes', and so we fell back to talking about "IRQ" sharing again...

Make sense?

Well no, because those "IRQs", as we have said, are 100% virtualized and dynamic! It's a much larger lookup table inside the PCI Express bridge, and since the Scope cards are typically going to be connected to the 'chipset' (rather than to a bridge hanging directly off the CPU's lanes), it's the devices on that chipset I would look to first. However, if a device native to the CPU's PCIe lanes is hogging the system, that will surely be even worse for the overall system's performance. It's just that our Scope cards tend to be the most sensitive to this, for as garyb has often said, they are very sensitive to not getting the data they need in time.

----------------------------

I'll try to break down the APIC changes over time real fast; since we have a variety of systems out there, this may be useful to some:

------------

On a machine from 2001-2005, IRQ sharing and allocation is actually going to go THROUGH an APIC controller chip, but the lines in question still correspond to physical traces on the motherboard. If we disable ACPI in the BIOS & Windows, we can resolve this sharing because you can SEE the lines being shared (set the computer type to "Standard PC" or "MPS Multiprocessor PC" in Windows, then check Device Manager).

On a machine from around 2008, where there are only 2 GPU slots and 1 more PCIe 1x or 4x slot, those extra PCI slots probably connect to the 'southbridge' via an APIC controller implemented as part of the southbridge chip, while the PCI Express 'bridge' for the GPUs is implemented directly on the 'northbridge'. Any additional PCIe lanes came from a 3rd-party PCIe bridge chip. Note again that the southbridge DIRECTLY implements the PCI lanes, and thus *is* the APIC controller functionality as well, so any devices connected to that bridge may impact overall performance, not just shared IRQs. We can still (if we REALLY want to) probably change the computer type in Windows, disable any EFI emulation (if present; it was just starting to emerge then) and disable ACPI in the BIOS. However, for a machine from this era that is likely to be unwise, as these machines were multicore enough that the non-ACPI configurations may have worse timing than ACPI, simply because the drivers for onboard devices & peripherals were by then developed largely for ACPI use.

Fast forward to now, and there's no longer a 'northbridge' chip at all, as CPUs implement a PCIe bridge stack internally and talk to both the GPUs and a few added lanes which get split between the chipset (which is what the 'southbridge' has become) and other devices. In other words, the PCI slots you're connecting to are being emulated by the chipset and/or CPU, 100%.

----------------------------

So the final takeaway: the real issue on a modern system happens when a peripheral being served doesn't release the CPU's INTR line in time, perhaps other devices have queued up to be serviced as well, and the Scope card doesn't get communicated with in time. This is what has NOT changed. What *has* changed is that those 'interrupt lines' are no longer physical lines or even the same addresses they were historically; what exists now was built in such a way as to ensure backwards compatibility with peripherals and software that want to address things the old way (hence all the ISA addresses you'll see in Device Manager when you do View > Resources by Type and expand the Interrupt Request (IRQ) section).

So ensuring that we share with *nothing* is unlikely on a modern system. The numbers you 'see' sharing aren't sharing copper traces the way they once did anyway, and haven't for many years. More important is to measure the devices in a system (like with DPC latency measurements) and see what the worst offenders are. It makes sense not to share with those (like the GPU, whose DPC performance can vary even across driver releases). But in some cases the problem may not even be a device that's actively sharing an IRQ, simply one that is not playing well with the whole system and with what the Scope card needs in that system.

I looped over this final point 3 times here, so I might edit this a bit later for brevity. But hopefully this whole piece makes sense not just in terms of how our 'PCI slot sharing' and 'IRQ sharing' situation has changed over time, but why certain 'fixes' that worked for one generation may or may not make sense--at least in the same way--on a different generation of machine.
t_tangent
Posts: 970
Joined: Sun Dec 28, 2003 4:00 pm
Location: UK

Re: IRQ sharing, the PCI bus, ACPI/APIC addressing and your Scope cards

Post by t_tangent »

Nice post Valis, you should sticky this
valis
Posts: 7306
Joined: Sun Sep 23, 2001 4:00 pm
Location: West Coast USA

Re: IRQ sharing, the PCI bus, ACPI/APIC addressing and your Scope cards

Post by valis »

Done :)