Woman vs. Man vs. Machine: 2013

Monday, March 25, 2013

Finding a Majority to Be In

I think a lot about being the only woman. I talk a lot about it. It's one of the first things I notice when I work with a new team. How many women are there? How many men? What's the precise ratio? No, seriously, I spend time in meetings with new teams or with aggregations of teams reducing fractions to find the exact ratio of men to women, what the percentage is, how much of the percentage I make up. Partly that's because, well, it's a meeting, and as a co-op, I don't get a merit increase, so I couldn't care less that they'll be calculated by next Wednesday EOD, but mostly it's because I focus so heavily on gender in the workplace.

And that's wrong!

I shouldn't do that! It's sexist!

Because gender doesn't matter in a professional environment. It doesn't define what I can do. Just because I am the minority, a member of the group that has been wronged in the past, doesn't mean that that's all I am. That's not what's most important about me. What's important about me is that I'm a programmer, and I like low-level code, and I have experience with higher-level code, and I like skiing and hockey. And tons more. In fact, just about everything is more important than my chromosome set. And even though I have a different chromosome set than the other 14 humans on my team, there's plenty of other stuff that I have in common.

So I am challenging myself. Instead of counting gender ratios, I want to count numbers of majorities I am a part of. I want to take note that I'm a member of the software developer majority. The engineering background majority. The hockey-watching majority. The video-game-playing majority.

Because when gender matters more than all that other stuff, that's sexism, period. When any attribute - attractiveness, race, sexuality, nationality, whatever - that you didn't opt into matters more than all the other stuff that you did opt into, that's bigotry. In the workplace, on TV, dating, at school, anywhere.

So, let's quit it. The American dream is that people can be whatever they want to be, right? That people can choose who they are. And I don't think America is too different from the rest of the world, not any more. None of us have an excuse to focus on something we can't help unless someone else did it first, and then only to find equality and fairness.

Friday, February 15, 2013

sscanf() Gotcha

Interesting gotcha I ran across using sscanf() today.

Here's the example scenario. In my case, I was parsing a .mcs file and looking for pairs of characters that were hex representations of bytes of data. That is, the string "FF001122" would translate to data = {0xFF, 0x00, 0x11, 0x22}.

fstream stream;
stream.open(fileLoc);

char data[8] = {};
char line[17];  // 16 hex characters plus the terminating '\0'
stream.getline( line, 17 );

for (int i=0; i<8; i++) {
    sscanf( line + (i*2), "%2x", &(data[i]) );
}

Upon program exit, I get a stack corruption error around 'data'. Weird, right? We don't go out of bounds at any point on data, right?

Well, we don't... But sscanf() is a different story.

Let's step through the for loop. We'll say line = "0011223344556677".

i=0.
data = { ?, ?, ?, ?, ?, ?, ?, ?}
Perform the sscanf.
data = { 00, 00, 00, 00, ?, ?, ?, ? }

See the problem yet?
We'll keep going:

i=1.
data = {00, 00, 00, 00, ?, ?, ?, ?}
Perform the sscanf.
data = {00, 11, 00, 00, 00, ?, ?, ?}

At the end, we have:

i=7.
data = {00, 11, 22, 33, 44, 55, 66, 77} 00, 00, 00 (out of bounds!!!)
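
The walk-through above can be reproduced without actually smashing the stack. This is only a sketch: `simulateOverrun` is a made-up helper, the 0xAA filler stands in for the "?" bytes, and it assumes a 4-byte, little-endian unsigned int (typical on x86, but not guaranteed everywhere). It mimics each 4-byte write by hand into a buffer with 3 spare bytes at the end.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: mimics what "%2x" does when handed a pointer into a
// char array, ASSUMING a 4-byte, little-endian unsigned int.
// The buffer holds the 8 "real" bytes plus 3 spare bytes, so the overrun is
// visible without corrupting anything.
std::array<unsigned char, 11> simulateOverrun(const std::array<unsigned char, 8>& values) {
    std::array<unsigned char, 11> mem;
    mem.fill(0xAA);  // 0xAA stands in for the "?" (uninitialized) bytes

    for (std::size_t i = 0; i < values.size(); i++) {
        unsigned int v = values[i];
        // One simulated sscanf call: 4 bytes written starting at offset i,
        // low byte first.
        for (std::size_t k = 0; k < 4; k++) {
            mem[i + k] = static_cast<unsigned char>((v >> (8 * k)) & 0xFF);
        }
    }
    return mem;
}
```

Feeding it the bytes 00, 11, ..., 77 leaves the first 8 slots looking correct and zeroes the 3 spare slots, which is exactly the part that lands outside 'data' in the real program.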

What's going on here is that sscanf() doesn't know that data is an array of chars. Since the "%x" conversion specifier was used, it assumes each pointer it gets refers to an unsigned int, and it writes a whole unsigned int (4 bytes here) through it. It doesn't check, or ask. Turns out there's a way to specify the size manually - the correct form to receive data back in char size (and thus not write out of bounds) is the format string "%2hhx" - that is, 2 hex chars read into one byte of data (specified by the "hh"). If you look carefully, the scanf formatting table in the spec tells you that this is expected behavior, but it's not particularly outspoken about it.
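
Here is roughly what the "hh" version of the loop might look like. Treat it as a sketch: `parseHexBytes` is a name invented for this example, and it assumes a C library whose sscanf() honors the C99 "hh" length modifier. The destination is unsigned char, since that is the type "%hhx" expects.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: parses pairs of hex digits from `line` into `data`.
// ASSUMES sscanf supports the C99 "hh" length modifier.
// Returns the number of bytes successfully parsed.
int parseHexBytes(const char* line, std::array<unsigned char, 8>& data) {
    const std::size_t len = std::strlen(line);
    int parsed = 0;
    // Stop early if fewer than two characters remain, so we never read
    // past the end of the string.
    for (std::size_t i = 0; i < data.size() && i * 2 + 2 <= len; i++) {
        // "hh" says the target is an unsigned char, so only 1 byte is written.
        if (std::sscanf(line + i * 2, "%2hhx", &data[i]) != 1) {
            break;  // not a hex pair; give up on the rest of the line
        }
        parsed++;
    }
    return parsed;
}
```

Checking the return value of sscanf() is not part of the original loop; it is added here so that a malformed line is noticed instead of silently leaving stale bytes in the array.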

So, keep an eye out, from your good friend Emily, wasting four hours of her day so you don't have to.

EDIT:
Turns out this doesn't always work; it depends on your architecture and whether or not you're using Unicode. I ran this same code in Unicode and the 'hh' modifier didn't help - sscanf still wrote 4 bytes every time. Internet searches suggest that the best solution is to use a temporary variable. Barf.
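
For the record, the temporary-variable workaround could look something like the sketch below. `parseHexBytesViaTemp` is a hypothetical name, not anything from a library. The idea is to let "%2x" write into a real unsigned int, which is exactly the type it expects, and then copy the low byte into the array, so nothing depends on "hh" being supported.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: parses up to `count` bytes from pairs of hex digits
// in `line`, going through an unsigned int temporary.
// Returns the number of bytes successfully parsed.
int parseHexBytesViaTemp(const char* line, unsigned char* data, std::size_t count) {
    const std::size_t len = std::strlen(line);
    int parsed = 0;
    for (std::size_t i = 0; i < count && i * 2 + 2 <= len; i++) {
        unsigned int value = 0;  // the type "%x" actually writes to
        if (std::sscanf(line + i * 2, "%2x", &value) != 1) {
            break;  // not a hex pair; stop here
        }
        // The field width of 2 limits the value to at most 0xFF,
        // so the narrowing cast loses nothing.
        data[i] = static_cast<unsigned char>(value);
        parsed++;
    }
    return parsed;
}
```

With the example from the top of the post, calling it on "FF001122" with a 4-byte array should fill in 0xFF, 0x00, 0x11, 0x22.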