400 data points
The speakers were highly critical of the fact that CCS contains a data model with over 400 data points. I think most people do not even know what a data model is. Most do not work in IT and don't know what data points are. Complaining about "400 data points" (oh my!) sounds pretty ignorant to someone who works with data. Here is why:
A data point is just a fact. Your first name is a data point. Your middle name is another. Your last name is a third. Do you go by Mr. or Mrs. or Ms.? Now we're up to four. Are you Junior or Senior or III or none of those? That makes 5 data points just in your name.
Then we come to your address -- there goes another 6 "data points". Add in phone number(s) and email address(es) and you get even more. The number of data points is nothing to be scared of.
If you weigh yourself every day for 365 days, you will end up with 365 data points, but only one type of fact (your weight) is being collected. So it is wrong to use the term "data point" for the type of fact being collected.
To be precise, what people are really worried about are the attributes of each student that will be tracked in the database. The key word is attribute, not data point. Here is an example. My name is Robert Lawrence White. That is three data points. It's public information now. The attributes are (1) First_name (2) Middle_name and (3) Last_name.
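For readers who think better in code, here is a rough sketch of the difference. The class name, variable names and weight values are made up for illustration; only the three name attributes come from the example above.

```python
from dataclasses import dataclass, fields


@dataclass
class Name:
    # Three attributes: the *kinds* of fact being tracked.
    first_name: str
    middle_name: str
    last_name: str


# One record holds one data point per attribute.
me = Name("Robert", "Lawrence", "White")
attribute_names = [f.name for f in fields(Name)]
name_data_points = [getattr(me, a) for a in attribute_names]

# A year of daily weigh-ins: ONE attribute (weight), 365 data points.
# The numbers are invented purely to fill the list.
weight_log = [180.0 + (day % 7) * 0.1 for day in range(365)]

print(len(attribute_names), "attributes,", len(name_data_points), "data points")
print(1, "attribute,", len(weight_log), "data points")
```

The count of attributes stays fixed no matter how many records or measurements pile up, which is why it is the number worth arguing about.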
When we complain about voting_status, we are not complaining about the data point; we are complaining about the attribute. Why is this important? Because if you don't want to look foolish, you will want to use the right terminology (saying "data point" when you mean "attribute" is like saying "doohickey" when you mean "rotator cuff").
Also, if you want to look up the actual information that is proposed to be gathered, you will want to know what to look for. If you look for "400 data points", you will most likely only find more rumors being spread about CCS and not the original source material.
The Data Model
The source material, by the way, is here:
That shows the attributes (and relationships) that are proposed to be kept on elementary and secondary students. There are 416 attributes listed there. That is probably where the "400 data points" claim comes from. But there are a number of reasons not to panic about this number of attributes:
- They are redundant. Social_security_number is listed twice. Middle_name and middle_initial are both listed.
- It is non-normalized (a technical term meaning, roughly, that a single piece of information is stored in more than one way). For example, race is one coded attribute (with 5 possible codes: White, Black, Asian, etc.). But it is also listed as: White (true/false) and Black (true/false) and Asian (true/false) etc., because a person can be both white and black (e.g. Obama) or black and Asian (e.g. Tiger Woods).
- It is comprehensive. Not every attribute applies to every student. It is extremely unlikely that a student would have an accurate, up-to-date, non-null answer for every single attribute.
This last point is the main reason people should calm down about this data model. This point is critical.
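The three caveats in the list above can be seen in a toy record. This is only a sketch: the student, the values and the exact attribute spellings are hypothetical and are not copied from the actual data model.

```python
# Hypothetical student record; names and values are illustrative only.
record = {
    "first_name": "Pat",
    "middle_name": "Lee",
    "middle_initial": "L",           # redundant: derivable from middle_name
    "last_name": "Smith",
    "race_code": None,               # a single code cannot express two races ...
    "white": True,                   # ... so the same fact is repeated as flags
    "black": True,
    "asian": False,
    "voting_status": None,           # never collected for this student
    "social_security_number": None,  # never collected for this student
}


def filled(rec):
    """Return the names of attributes that actually hold a value."""
    return [name for name, value in rec.items() if value is not None]


total = len(record)
populated = len(filled(record))
print(f"{populated} of {total} attributes populated")
```

Even in this ten-attribute toy, several slots are duplicates of one another and several are simply empty. Scale that up to 416 and the headline number overstates what any one student's record would really contain.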
Yes, the data model asks what the student's voting_status is. This is defined as: An indication as to whether an individual is registered to vote in public elections. The possible answers are Registered, Not Registered and Not Eligible. Almost every elementary and secondary student will be listed as Not Eligible. How this information could possibly be kept up to date is beyond me. You may recall that the gov't Voter Registration bureaus do not keep this info up to date, so how the hell is the school going to do it? The answer is, it simply is not. This data is never going to be anywhere close to accurate.
So why put it in the data model? Think of it this way: A data model is just a proposal for how to organize data. It is not a law that says all information in the model must be completed. It is only a proposed standard way of expressing data and its relationships in a way that can capture the closest depiction of reality its authors could imagine.
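Here is a minimal sketch of what "a slot in the model that nobody is required to fill" looks like in practice. The three status values are the ones quoted above; the Student class, its fields and the sample students are my own invention for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VotingStatus(Enum):
    # The three possible answers given in the attribute's definition.
    REGISTERED = "Registered"
    NOT_REGISTERED = "Not Registered"
    NOT_ELIGIBLE = "Not Eligible"


@dataclass
class Student:
    # Hypothetical, heavily trimmed stand-in for a student record.
    first_name: str
    last_name: str
    # Optional: the model defines the slot, but nothing forces it to be filled.
    voting_status: Optional[VotingStatus] = None


third_grader = Student("Pat", "Smith")
senior = Student("Sam", "Jones", VotingStatus.NOT_ELIGIBLE)

print(third_grader.voting_status)
print(senior.voting_status.value)
```

Defining the attribute costs nothing and obliges no one; a school that never collects it just leaves the slot empty.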
I have worked with two very large data model standards. One was for the hotel/travel/leisure industry. As you can imagine, hotel companies do not have much need for the portion of the data model that deals with cruise ships, and vice versa. But the data model describes a lot of details about both.
The other was a retail point-of-sale (POS) data model. I can vouch that the data modeled by a ladies' apparel store is not the same as for an auto parts store. But there is a single POS data model that purports to cover most of the data that either type of store would need to keep. By the way, there is a good chance that Wal-Mart is keeping more than 400 attributes on you -- and way more than 400 data points if you shop there a lot and use a credit card or checking account that identifies you personally.
Nothing to do with parents' politics
So think about voting_status for a second. At last night's meeting, we were told that this was to keep track of the parents' party affiliation: Whose kids are Republicans? Whose kids are Democrats? Whose kids listen to Glenn Beck? This turns out to be inaccurate. The data model tracks whether the student (not the parent) is registered to vote, and it does not track any political affiliation at all.
Bill Bennett was right when he said that the opposition to Common Core Standards is like the tin foil hat people. What typifies the so-called "tin foil hat" people is that they believe something weird without evidence of it and try to scare others into believing their ignorant and conspiratorial misunderstanding. That is what is happening when people make up stuff about the gov't tracking the political affiliations of students' parents.