CAN Basics (bit-wise arbitration, differential signals)
The CAN Basics Training Course provides a practical approach to understanding how CAN works. By giving real world examples, common practices, and an in-depth look at DBC files, Bryan Hennessy gives a real-world walkthrough of CAN.
Presentation by Bryan Hennessy. Recorded as part of a ‘live’ training session in January 2019.
Video Transcript:
Bryan Hennessy: [00:00:02] So we’re going to jump into bit-wise arbitration. Bit-wise, when I first studied NMEA 2000 in, I think it was 2003 or 2004, I’d just got into marine industry, bit-wise arbitration was really what I studied. I fell in love with it because it solves so many problems that I’d encountered for the previous 20 years of my career doing digital communications and machine communications. It just made sense. Just I read it and I thought, “Wow, why didn’t I think of this? This is so simple.” So basic, yet so powerful this bit-wise arbitration thing that I just got to understand it and use it more. And that’s when I just kind of shifted my whole career focus over toward CAN.
So, when you talk about differential signaling, anytime you’re measuring a voltage or looking at a voltage on a wire, you got to measure it with some reference. It’s going to be measured against some other reference. Otherwise, it doesn’t [00:01:00] mean anything. So it’s just like measuring a height. Height above what? It’s got to be above something. You got to have a reference point.
So engineers, electrical engineers certainly understand this very well. So this is a representation of the CAN_Low and the CAN_High signals and the voltages on those signals. I need to add a few things to this slide but I’ll talk through them this time. So we have different timeslots, time one, time two, time three here. So during time one, what’s on CAN_High is about 2.5 volts, what’s on CAN_Low is about 2.5 volts as well. This is called a 1, or a recessive bit, or transmitting a recessive state. The names are important here because they mean something.
In timeslot two, and this is time going from left to right as it is in an oscilloscope or many different diagrams, in timeslot two, we have a dominant bit. [00:02:00] We have about 3.5 volts on CAN_High and about 1.5 volts on CAN_Low. Now these are voltages out of a transmitter of course. As you get down the network, the voltages could be much lower. But the point with differential signaling, differential signaling simply means the receivers are measuring the difference between these two signals, and they can see that there’s very little to no difference, or there’s a pretty substantial difference. That difference on the receiver certainly may not be 2 volts as it is here, the difference may be two-tenths of a volt but it’s still going to receive it because with differential signaling the signal travels a long way on the two wires and it’s fairly easy for the receiver to take off the wire and understand. So we can reliably get a 0 across the network and we can reliably get a 1 across the network when we want to [00:03:00] in a fairly large network with differential signaling.
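To make the idea of measuring the difference concrete, here is a minimal Python sketch. It uses the nominal transmitter levels quoted above (2.5 V / 2.5 V recessive, 3.5 V / 1.5 V dominant). The common-mode shift values are hypothetical and only illustrate that an offset appearing equally on both wires does not change what a differential receiver sees.

```python
# Nominal transmitter levels from the talk: (CAN_High, CAN_Low) in volts.
RECESSIVE_LEVELS = (2.5, 2.5)
DOMINANT_LEVELS = (3.5, 1.5)


def differential(can_h, can_l):
    """Voltage a differential receiver looks at: CAN_High minus CAN_Low."""
    return can_h - can_l


def with_common_mode_shift(levels, shift):
    """Move both wires by the same offset (hypothetical noise or ground shift)."""
    can_h, can_l = levels
    return (can_h + shift, can_l + shift)


for shift in (0.0, 1.0, -1.0):  # hypothetical offsets in volts
    rec = differential(*with_common_mode_shift(RECESSIVE_LEVELS, shift))
    dom = differential(*with_common_mode_shift(DOMINANT_LEVELS, shift))
    print(f"shift {shift:+.1f} V: recessive diff = {rec:.1f} V, dominant diff = {dom:.1f} V")
```

In every case the recessive difference stays at about 0 V and the dominant difference at about 2 V, even though the absolute voltage on each wire has moved.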
A lot of engineers will say differential signaling is more reliable than single-ended signaling where you’re measuring one wire with a reference to a ground. So the voltage is only changing on one of your transmitting wires, one of your communication wires. Ground is the other wire. A lot of people say that’s not going to transmit as far because you’re, in essence, sending energy down the line. In differential signaling, the average energy that you’re sending down the line should really be zero because it balances out from the 0’s to 1’s. So it really shouldn’t be sending any energy overall down the wire, which means there should be less resistance, less drop of the signal. So, I don’t know how true that is but that’s what a lot of engineers say.
Unidentified Man: I have a quick question.
Bryan Hennessy: Yes.
Unidentified Man: The tolerances on the measurements, [00:04:00] I guess those are set by the different standards on top of CAN.
Bryan Hennessy: Yes.
Unidentified Man: I guess on the use cases as well depending on how far you’re sending the signal.
Bryan Hennessy: Yes. So I was going to get into that a little bit. Well, I guess I kind of did. The drop cable distance, the baud rate may be different in different industries. CANopen, for example, you can specify all kinds of different baud rates. It’s a balance and it’s set during the design of the system in most cases. As far as the tolerances are concerned, that’s set by the CAN standard and that’s built into the receivers and the transmitters, mainly the receivers. Every receiver is a transmitter in CAN. I actually have a diagram of an example of a transceiver circuit on the bus coming up.
I don’t know the exact tolerances, I’d have to reference back to the specification. Point being [00:05:00] that there is some threshold, we call it in engineering, where the receiver is going to say that’s either a zero, or that’s a one, or possibly that’s a no man’s land, and I can’t really determine it.
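The three outcomes described here (a zero, a one, or a no man's land) can be sketched as a small Python function. The default thresholds of 0.5 V and 0.9 V are assumptions: they are figures commonly quoted for high-speed CAN receivers, not numbers given in the talk, and should be checked against the specification and the transceiver datasheet for any real design.

```python
def classify(can_h, can_l, recessive_max=0.5, dominant_min=0.9):
    """Classify one sample of the bus from the differential voltage.

    recessive_max and dominant_min are assumed thresholds in volts.
    Returns 0 for dominant, 1 for recessive, None for the undefined band.
    """
    diff = can_h - can_l
    if diff >= dominant_min:
        return 0      # clearly dominant
    if diff <= recessive_max:
        return 1      # clearly recessive
    return None       # no man's land: the receiver cannot decide reliably


print(classify(3.5, 1.5))    # nominal dominant levels
print(classify(2.5, 2.5))    # nominal recessive levels
print(classify(2.85, 2.15))  # a difference of about 0.7 V falls between the thresholds
```

Keeping the thresholds as parameters makes it easy to substitute whatever limits the relevant standard or datasheet actually specifies.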
Unidentified Woman: I can say when I worked at the hardware department, I worked a lot with test rigs for all products. So every positive product is tested after production and we try to stress the CAN bus as much as we can. I can say it’s really, really hard to make it not work. At least with regular CAN. Haven’t worked with CAN FD yet, but with regular CAN it is really hard. I looked with an oscilloscope on the signal. I couldn’t even see it was a CAN signal. I mean it was so messed up but it worked anyway. So, if you have a network and you get an error frame or something, when you look at it you probably won’t even [00:06:00] see that it’s a CAN channel.
So, if everything is good, it looks like this. But you can do a lot of things to the CAN network because they work.
Bryan Hennessy: I’ve heard a lot of people say CAN is too reliable because you can produce a CAN network and if you don’t really take the right steps to look at it and validate and make sure everything’s working right, it’s going to work but it may be all wrong. You may have a wrong terminator on it, or a wrong value, or the wiring may be completely wrong from what the standard says. But hey, it works, so let’s produce a million of them and ship them out the door. Then you suddenly find that, “Oh boy, I was too close to the edge on that and it’s not right and not all of them are working here.” There’s a lot of applications in the production world as many of you, I’m sure, know where a 2% failure rate’s going to put you out of business, and you better get it right so you can get a [00:07:00] 99.9% proper working rate, and a very, very small failure rate.
But yes, that’s the whole thing about CAN being almost too reliable. It works in a lot of situations where it just shouldn’t. A lot of that’s because of differential signaling and bit-wise arbitration.
So we understand what the signal looks like at the transmitter at any given CAN transceiver or transmitter chip. Kvaser doesn’t make transmitter chips. We don’t make silicon. We use silicon from TI, or from Microchip, or from one of the other silicon suppliers to do this. But they’re, hopefully, all the silicon, and we have other silicon designers in the room, I guess, all the silicon is designed to meet the standard properly and be a proper transceiver for the CAN standard.
So this is a very important slide. This slide, for understanding bit-wise arbitration this is the slide. [00:08:00] The slide shows two nodes on a network. The theory behind what I’m going to explain here could be 2, 3, 4, 50, 100 in some cases. I usually say 50 because the NMEA 2000 spec stops at 50 and they say that’s the maximum. But other networks, I’m sure there are more than 50 in some other applications with CAN on any given network. Theoretically, there’s a maximum. I don’t really know what it is. It would depend on the bit rate and other things, but we won’t worry about the maximum. But it’s way up there.
So with two nodes on a network, you got four timeslots here from left to right. Then you have what’s actually on the bus. Now you got to remember the CAN network I showed you. Every node is hooked up to the same two wires, so there’s only two wires on the network. So there’s only two real voltages that can be on, one on each wire, at any given point in time.
So in timeslot one over there, [00:09:01] Node A is transmitting a 1 and Node B is transmitting a 1. Whether we say a 1 is a recessive or dominant, we said it’s a recessive bit. So the CAN bus, the two wires that make up the bus will float. If nothing on the bus is transmitting and it’s just idle, it’ll be in this state, it’ll be in the recessive state.
I should back up a little bit. One of the things you wrestle with in any data communication scheme is mitigation of collisions. If two devices are wanting to transmit at the same time, how do you deal with it? How does the system deal with it? In an Ethernet network with a bunch of devices on it, when two of them start to blast data at the same time, it causes a voltage spike and the receivers sense that spike and they back off because they know there’s been a collision. The Ethernet standard actually says they wait [00:10:00] a random amount of time and try again to transmit their data and hope there’s not a collision.
Now in a busy Ethernet network this isn’t a good thing. Random amount of time isn’t a good thing. CAN is very deterministic. When you got to send high priority information for a control system, like hit the brakes in my car because I need to stop, that’s a pretty important message you want to get across your CAN system, you don’t want a random amount of time anywhere in the equation. You want set deterministic amounts of time, and that’s where CAN comes in and it’s one of the big advantages of CAN over Ethernet. This is why, it’s because of bit-wise arbitration and prioritization of messages.
So, again, back to the timeslot A, both Node A and Node B are transmitting a recessive state. Timeslot B comes along and both Node A and Node B say I got data to transmit. I’m going to transmit a dominant state, which happens to be a start of frame bit. So this would be a start of frame bit if nobody had the bus [00:11:00] and nobody was transmitting during timeslot 1. Timeslot 2 is a start of frame bit and both Node A and Node B, they got a frame to transmit. We got data. We’re going to send it.
So as you’ll see later, timeslot 3 would be the first of three priority bits. So timeslot 3 comes along and that’s where things get interesting. Node A says I’m going to transmit a 1 and Node B says I’m going to transmit a 0, because my priority here is, well, this would be a higher priority than this one. I’ll explain that later. But he’s just transmitting 0, he’s transmitting a 1.
So what’s going to appear on the bus at this point? Well, of course here 1’s going to appear on the bus because they’re both transmitting it. Here, a 0 is going to appear on the bus because they’re both sending that signal as well. But here you got one sending a 1 and one sending a 0. This is a recessive and this is a dominant. [00:12:00] What appears on the bus – I’ll show you electrically why later – is a 0. Every time a CAN transceiver transmits something on a bus, it reads back what’s actually on that bus through its receiving circuit. And if it doesn’t match what it’s transmitting, it knows something. What it knows is, “Hey, there’s somebody else transmitting something on this bus and they’re a higher priority than me, so I better quit transmitting.”
So at this point in time, right here, Node A is going to say I’m sending a 1 and I’m reading a 0, there’s a higher priority device transmitting. I’m going to stop transmitting. So from this point on, Node A simply leaves the bus in a recessive state so it doesn’t drive the bus at all. Node B has won arbitration at that point. That’s as simple as bit-wise arbitration is. That’s all it is.
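The four timeslots on the slide can be replayed in a few lines of Python. This is a minimal sketch, not a model of a real CAN controller: the function names are made up for illustration, and the bus is reduced to the rule described above, that a dominant bit (0) from any node overrides recessive bits (1) from the others.

```python
DOMINANT, RECESSIVE = 0, 1


def bus_level(driven_bits):
    """The bus reads dominant if any node drives dominant, else recessive."""
    return DOMINANT if DOMINANT in driven_bits else RECESSIVE


def arbitrate(frames):
    """Replay bit-wise arbitration.

    frames maps a node name to the bits it wants to send, one per timeslot.
    Returns (nodes still transmitting at the end, bits seen on the bus).
    """
    contenders = set(frames)
    n_slots = len(next(iter(frames.values())))
    bus_bits = []
    for slot in range(n_slots):
        level = bus_level([frames[name][slot] for name in contenders])
        bus_bits.append(level)
        # A node that sent recessive but reads back dominant has lost
        # arbitration and stops driving the bus from here on.
        contenders = {name for name in contenders if frames[name][slot] == level}
    return contenders, bus_bits


# Timeslots 1 to 4 from the slide: idle, start of frame, first priority bit, recessive.
frames = {
    "A": [1, 0, 1, 1],
    "B": [1, 0, 0, 1],
}
winners, bus = arbitrate(frames)
print("still transmitting:", winners)  # only Node B remains
print("seen on the bus:  ", bus)       # [1, 0, 0, 1]
```

Node A drops out in timeslot 3, where it sends a 1 and reads back a 0, and the bus carries exactly Node B's bits, so nothing Node B sent is corrupted by the collision.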
[00:13:01] Timeslot 4, they’re both going back to recessive and of course a recessive is going to appear on the bus. So, any questions on that?
So that’s an important concept. For one, once you understand this, when I tell you that a priority 0 is better than a priority 1, 2 or 3, that the lower number wins on the bus, and priority is three bits in the J1939 scheme, or in NMEA 2000 anyway, you fully understand why. You say, “Hey, I understand why 0 is higher priority than 7 or 6, or any other number, because 0 is three dominant bits as opposed to any recessive bits.”
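A short Python sketch shows how the three priority bits relate to the identifier. It assumes the J1939 layout in which priority occupies the top three bits of a 29-bit identifier, and it assumes every frame on the bus uses 29-bit identifiers. The identifier values are illustrative examples, not taken from the talk.

```python
def j1939_priority(can_id):
    """Extract the 3-bit priority from the top of a 29-bit identifier."""
    return (can_id >> 26) & 0x7


# Illustrative 29-bit identifiers (assumed values for the example).
ids = [0x18FEF100, 0x0CF00400, 0x1CECFF00]

for can_id in ids:
    print(f"0x{can_id:08X}: priority {j1939_priority(can_id)}, "
          f"top bits {j1939_priority(can_id):03b}")

# The priority bits are sent first, so the frame with the most leading
# dominant (0) bits wins, which is the numerically lowest identifier.
winner = min(ids)
print(f"wins arbitration: 0x{winner:08X}")
```

Priority 3 is `011` on the wire, while 6 is `110` and 7 is `111`, so the priority 3 frame puts a dominant bit on the bus in the very first priority slot and the other two back off.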
Unidentified Man: So is that the same as the address in [00:14:00] the CAN network?
Bryan Hennessy: That is a good question because that is determined by the data link layer, which is what we’re going to talk about next when we get over the physical layer. Sometimes, it’s the address. Sometimes, it’s the priority. It’s different in different industries that use CAN. That can be defined differently depending on what industry you’re in. That could be used differently. But it’s still bit-wise arbitration on the physical layer.
This, I grabbed this from a spec sheet on an actual CAN transceiver. Maybe a little more complicated than I wanted, but it gets the message across. So this is where your CAN microcontroller is sitting on the left and this is your CAN_Low and CAN_High out here on the right. So these are your two wires actually on the CAN bus.
So the only thing I really wanted you to get out of this is that, so these are just TTL signals coming in here. [00:15:00] So, transmitting something. Then there’s this assert signal that turns the transmitter on. So if this transmitter is in a recessive state, it’s not transmitting something or it’s transmitting a 1, the assert signal when you’re not transmitting anything is going to be off. So basically to understand what voltages these are at or what voltage this device is driving these lines to, you just take that out of the circuit. If you take that out of the circuit, you just got a resistor divider here, and the resistors are the same value, so the voltage on both of these lines is going to be the same because they’re both resistor dividers, right here. The voltages are just going to be a mid voltage halfway between VCC and ground.
So if you think back to the diagram I showed you originally about differential signaling, [00:16:00] these are just floating at a mid voltage. When you want to drive something and an active signal comes on to TX and you’ve turned on this driver with the assert signal, then you’re going to drive them to whatever state the driver’s designed to drive them to, and the resistors will allow you to pull the signals to that voltage, and then you’re driving to a dominant state. So the processor on the other end is making all the decisions about whether to drive this and when to drive this. The feedback is coming right here. So it’s going to read through the receive line to make sure it’s the same thing it’s transmitting. So the processor has complete control over the network.
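The mid voltage follows from the ordinary voltage divider formula. The Python sketch below assumes a 5 V supply and uses hypothetical resistor values; with equal resistors the result is half of VCC whatever the actual resistance is.

```python
def divider_voltage(vcc, r_top, r_bottom):
    """Voltage at the midpoint of two resistors in series between VCC and ground."""
    return vcc * r_bottom / (r_top + r_bottom)


VCC = 5.0      # assumed supply voltage in volts
R = 25_000     # hypothetical resistor value in ohms, same for both resistors

recessive_level = divider_voltage(VCC, R, R)
print(f"recessive level on each line: {recessive_level} V")
```

With a 5 V supply this gives 2.5 V on each line, which matches the recessive level of about 2.5 V on both CAN_High and CAN_Low in the differential signaling diagram.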
Unidentified Man: Do you think it needs to share a common ground?
Bryan Hennessy: In CAN, you don’t in all cases. Most devices do. Most systems do. In NMEA 2000, they call it shield. They shield the cable but it doesn’t have to. [00:17:01] Two CAN devices are certainly going to talk and communicate with each other without a common ground. I have right here, I don’t have a common ground on this network. That’s the other neat thing about differential signaling, is you don’t need a common ground.
So, some uses of CAN do specify a common ground, basically for noise abatement, for noise in the system, but it’s not necessary in differential signaling.
This is stolen out of J1939 specification. This is actually stolen out of a J1939-84 specification that talks about diagnostic tools and how they plug in. I just use this as a representative diagram of some of the things that different industries have to set when they’re using CAN, such as the trunk cable length, the maximum trunk cable length, the drop cable lengths here, the length from the resistor [00:18:00] to the first drop cable and these. So different specs will say these are different maximums, minimums, depending on how fast they’re going, what wire they are using, is it twisted pair or not, what gauge it is, what insulation it is, that’s all stuff that can vary in different industries. You’ll see numbers again. This is an example pulled out of J1939, but you’ll see different numbers for these different values in different industries.