Friday, October 4, 2013

'Big Data' doesn't just mean increasing the font size.

Title taken from today's xkcd.

I don't know about you, but I work in an industry that's super excited about "Big Data" right now. I work in Education, but I get the impression that it's not just Education that's interested in Big Data. So, let's start with defining Big Data.

There's a lot of information out there. Most of it is really hard to collect and understand. That's why we have scientists carefully creating controlled environments and recording data. That's why we have police detectives collecting evidence. That's why we have auditors leafing through your receipts. But for a while now, we've been accumulating data in electronic form. All electronically collected data is more or less structured.

Usually structured data refers to something like a spreadsheet of numbers while unstructured data refers to something like tweets. But let's expand that out. Even your tweets are somewhat structured. They are made up of typed words, links, and hashtags that can be categorized, collected, and analyzed by a computer. Let's think of them as structured and instead, let's think of unstructured as referring to a behavioral psychologist noting monkey grooming rituals.

A tweet is a direct recording of when and where you tweeted, what you wanted to tweet, how your friends responded to it, etc. Notes on monkey grooming are subjective, possibly missing details, and possibly recorded using inconsistent terminology. The point is, your tweets are ripe for analysis. As is most of your electronic data.

In Education, Big Data includes attendance records, test scores, disciplinary records, socioeconomic status, graduation rates, job placement, etc. In Finance it's everything from stock prices to product reviews to weather reports, to, yes, your tweets. In business it's HR records, budgets, market data, and much much more. I'm hard pressed to think of an industry which isn't collecting structured data.

And that's just the point. We're collecting data. We're accumulating it. It needs to go somewhere and it needs to be stored in a way that makes it easy to access, secure, and safe from degradation. The movers and shakers across most industries have realized this for a while now. That's why we have "data warehouses" and "cloud storage."

Right now is probably a good time to point out that "safe from degradation" is a problem of data collection as much as data storage. Remember when we were discussing how our monkey psychologist might record data using inconsistent terminology? Maybe today our scientist recorded the monkey as having "groomed," but yesterday said "scratched," and the day before that said "itched." And maybe two months ago, the scientist switched from taking notes in outline form to using a narrative format. These problems are more common with unstructured data recording, but can also occur in more structured contexts.

Every time these changes happen, the data either needs to be "cleaned" or the analysis of the data needs to be sophisticated enough to gloss over these issues, like how you can ask Siri to, "wake me up tomorrow at 7" and Siri knows to "set an alarm for 7am on Tuesday." That's not exactly what you said, but Siri recognized the intent of your command.
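To make "cleaning" a little more concrete, here's a minimal sketch in Python of mapping the scientist's inconsistent terms onto one label. The mapping and the function name `clean_observation` are made up for illustration, and whether "scratched" and "itched" really mean the same thing as "groomed" is a judgment call the researcher would have to make.

```python
# Hypothetical mapping from the terms a note-taker used to one canonical label.
# Deciding that these three words describe the same behavior is an assumption.
CANONICAL_TERMS = {
    "groomed": "groomed",
    "scratched": "groomed",
    "itched": "groomed",
}


def clean_observation(raw_term):
    """Normalize one recorded term; return None for terms not in the mapping."""
    return CANONICAL_TERMS.get(raw_term.strip().lower())


# Example notes with inconsistent capitalization, spacing, and wording.
notes = ["Groomed", "scratched ", "itched", "napped"]
cleaned = [clean_observation(term) for term in notes]
```

Anything the mapping doesn't recognize comes back as `None` rather than being guessed at, so a person can review it instead of the data quietly degrading.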

Let's retrace our steps for a moment and identify the data requirements we've encountered so far:
1. Collect the data | Practice good data structuring
2. Store the data | Ensure data is safe, secure, clean

As part of storage, most industries have recognized that you need to access the data once it's stored. Accessing data should be easy and quick. If you need an advanced degree in computer science to pull a report and if it takes two weeks to create the report, it's not ideal. Why isn't it ideal? Because the data isn't useful if you have to call your IT guy every time you want to check on something. And usually the thing you want to check on can't wait two weeks. That brings us to requirement three:
3. Access the data | Ensure it's quick and easy

We've also hinted at requirement four:
4. Make the data actionable | ???

While I'm seeing a lot of collection and storage, I've just started to see my industry realize why they need to access the data. Making data actionable is actually really hard. This is in large part because to really understand data, you need to treat it like a scientist, like a mathematician, like an analyst. It requires identifying variables, setting constants, and running statistical analysis. Sadly, most industries and technologies have barely scraped the surface of this problem. I see a lot of requirements like:
- Automatically show the latest data
- Create bar graphs
- Create pie charts
- Update the graphs and pie charts

And I see very little:
- Run t-test
- Set p value
- Set threshold
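For a sense of what that second list could look like, here's a rough sketch of Welch's t-test using only Python's standard library. The scores, the group names, and the function name `welch_t` are invented for illustration. The threshold is the two-sided 5% critical value from a standard t table for 8 degrees of freedom, which assumes a significance level of 0.05.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variance divided by sample size, one term per group.
    term_a = statistics.variance(sample_a) / len(sample_a)
    term_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(term_a + term_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (term_a + term_b) ** 2 / (
        term_a ** 2 / (len(sample_a) - 1) + term_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Made-up test scores for two hypothetical groups of students.
tutored = [78, 82, 85, 90, 75, 88]
not_tutored = [70, 74, 79, 72, 68, 75]

t_stat, df = welch_t(tutored, not_tutored)

# Two-sided critical value at the 5% level for 8 degrees of freedom.
# Rounding the computed df down to 8 makes the test slightly conservative.
CRITICAL_T = 2.306
significant = abs(t_stat) > CRITICAL_T
```

The point isn't this particular test. It's that the significance level and the threshold get chosen before anyone looks at a bar graph.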

Most problematically, as industry leaders reach toward an understanding of the need for analysis, they jump right to the outcome, skipping how to get there. I see requirements for:
- Early warning indicators
- Graphing of benchmark test scores against high stakes test scores

Do you know what you can tell by graphing two different tests against each other? Very little. It reminds me of this comic. Let's add to number four:
4. Make the data actionable | Ensure analysis is statistically sound

One more time, all four requirements together:

1. Collect the data | Practice good data structuring
2. Store the data | Ensure data is safe, secure, clean
3. Access the data | Ensure it's quick and easy
4. Make the data actionable | Ensure analysis is statistically sound

I think we'll get there, but it might take another decade.

Friday, July 5, 2013

Enough.

Two years and two months ago I did a weight loss competition.  For 10 weeks, my female coworkers and I competed to become the biggest loser.  Each week we weighed in, keeping the weight anonymous but publicizing the losses, and sharing tips and tricks for both diet and exercise.  We helped each other make better lunch choices, and we confessed to each other when we splurged.  Some ladies even worked out together.

In the end, I was the winner.

It was shocking for me.  I had never paid attention to my weight as a numeric value, and I rarely considered my fitness at all. I felt empowered by my success, but I didn't want to become fixated on a scale or a calorie counter app.  I have too many friends with eating disorders, and I carry enough self-doubt and stress as it is.

26 months later, and I'm far heavier than I was even before that competition.  This is probably the heaviest or second heaviest I've been in my life.  I have to buy new clothes every few weeks as I gain more.  I don't want to look at myself in the shower.  I'm scared about my future. 

I've made sporadic attempts at diet and exercise over the past two years, but nothing that was backed by willpower.  But two weeks ago, the day after my 26th birthday, as I attended my boyfriend's friend's wedding, I knew I needed to do something.

Even in a flattering dress with a new pair of heels and a fashion-forward up-do, I found myself hoping eyes would slide over me, remembering my personality but not my looks.  I wanted to hide.  I had had enough.

Three days after the wedding, I started a trial membership at a local gym.  In the past 11 days I've gone 10 times.  I stopped going to Dunkin Donuts for a latte/muffin/bagel workday food fix. I stopped suggesting my boyfriend and I order pizza, instead suggesting we roast vegetables.  I can't tell if it's working.  11 days isn't 10 weeks, but I hope that I can achieve the kind of loss I saw before, and then keep it going.

I have a hard time imagining myself still keeping up this pace in the cold, dark days of January, but I can see myself through to the end of the summer, maybe the end of the fall.  Perhaps by that point it will be habit.

I recently read Children of the Mind by Orson Scott Card.  He introduces an eastern philosophy wherein a life of perfect simplicity is achieved not through a continuous monitoring of balance and harmony, but by recognizing the end of endurance and saying, "enough."  A practitioner stays a given course till he or she can take no more, and then shifts directions.  I think of it a bit like a Roomba, waiting until hitting a wall before adjusting course, but in the end, getting the whole room clean.

That's what happened to me at the end of June.  Enough. I had traveled farther and farther into lethargy until I could take no more. And now I've changed direction.

I plan to take this as far as I can.  To the end of the summer? Until I fit into my clothes from two years ago?  Until I'm at a healthy BMI?  Until I feel comfortable in a two-piece bathing suit?  Until I can run a marathon?  I don't know what my cut point will be.  I don't know when I will have had enough.

I'm at the beginning of this process.  With the full range of potential in front of me, I feel optimistic. 

Thursday, April 18, 2013

Interview vs. Ambition

During a job interview, should I lie about what I want to be doing?

It seems like there's an obvious answer.  Of course you shouldn't lie.  Interviews need to be polite, maybe reserved, but you should always be honest about your experience and your skills.  

But everyone you meet wants to fit the square peg in the round hole.  I walk into an interview and I know what they want from me.  They want me to be the job description.  They want me to be the corporate culture and all the other marketing-speak on their website, and they want me to figure out the subtext of our interview and the job description (examples: we need someone organized because the last person was a mess; we need someone with perfect grammar because these documents go to our clients; we need someone who's no-nonsense because this role gets very little intrinsic respect).

They're going to ask about why I'm looking for a job now and what I want to do in 5 years, but I'm not sure they care.  More than that, I think they might prefer not to know.  

It's a dichotomy:  On the one side, most companies aren't interested in hiring someone who has no ambition.  On the other side, no one wants to hire someone who walks in thinking their role is a stepping stone to something more.  Or do they?

I'd hope a good employer would want the person with crazy strong ambition.  As long as that person is also loyal, wouldn't it be great to grow a customer support rep into a vice president or an administrative assistant into a director?  I've seen team leads say they would be interested in that, but I don't know that it's true.

I do know that many companies prefer pigeonholing.  They hire you to do X.  Maybe in 5 years you can move to Senior X.  Maybe even X Team Lead.  But what if you strive for the sort of dynamic career that the most interesting and successful people have?  I am starting to feel like no one wants to hire that person.