HackyHawaii For Life by AK: November 2007

Saturday, November 24, 2007

sicko, google, and grass-tech-roots

sicko
i just finished watching SiCKO. great movie. i don't care if it is potentially one sided. at least moore is sticking up for what he believes in.

there are a couple of things that bother me about these types of films:

a movie like sicko doesn't get into the mainstream. that is crazy to me. a quick little story: i bought an inconvenient truth DVD months after it was released (i really wanted to see it, but never got around to buying the dvd), thinking that everyone should have heard about the movie by now. i went to a family party and asked, "hey, i just got inconvenient truth, what did you guys think of the movie?" to my surprise, only one family member knew what i was talking about. i was shocked. so, i went to work and asked the same question. this time a few more people knew about the movie, but some were completely oblivious to the whole thing. wow, i said to myself. people just don't pay attention to this kind of stuff. hmm...

critics... there are too many critics. blah, blah, blah... i just happened to read the readers' reactions to al gore winning the nobel peace prize. i hate critics (haha, irony). they say "blah blah blah, gore sucks, blah, blah, blah". instantaneously, my brain translates that to "haha, haha, haha, you suck way more, you dumb critic, haha, haha, haha". like i said earlier, at least gore is trying to do something. even if it really does suck, then you, for not doing anything, must suck way way more. that's how i feel. just take these movies for what they are worth - the entire purpose is to raise awareness - and gore did an excellent job.

anyway, back to sicko. sicko makes me feel like america is the odd ball. i get the "why can't we just get with the program" feeling.

google
google is the exact opposite. google is the odd ball, but in a good way. i get the "why can't every company be like google" feeling. this is strange, because almost everything google does seems to be open to the public. google is awesome because what we hear about them is awesome. (haha, why can't yahoo copy that? why can't anyone else copy that?) for example, they focus on global justice. here is a cool quote:

"Imagine somebody saying, 'You know, our greatest asset is our employees,'" said Cohen, a political theorist and professor of political science, philosophy and law. "Imagine, number one, that it's true; and number two, that they take it seriously. It's as if that's what's going on at Google. I don't think any company has done this.

(i took this a little out of context so you'll have to read the whole article yourself) imagine a company taking its employees seriously. wow, what a concept. here is more:

"It's been a unique experience to figure out how a company like Google could make a difference,"

"We want Googlers to understand the issues," she said. "A lot of times we probe speakers about how Google can help. Googlers are always coming up with solutions."

you know what, i'm sick (sicko... haha) and tired of hearing that googlers can do this and can do that. why can't i! why can't every company have a .org (as in google.org) associated with it.

anyway... google seems like the best place to work for many many reasons. haha, one of my friends couldn't come up with one reason why he worked at his company. boo to that. why can't his company be awesome like google? is it impossible? does google have the monopoly on awesomeness? i think not.

tom hanks
so, i found out that tom hanks has a myspace page (haha, i found that out by watching oprah). anyway, what kinda started this whole blog was that i was noticing how people are using tech to spread the word. a kind of grass-tech-roots kind of thing. sicko and inconvenient truth did it with blogs, comments, votes, badges, etc. tom hanks does it with myspace pages. so does leo - and i was reminded about his film the 11th hour. grass-tech-roots (i just made that up) is cool. grass-tech-roots is the kind of thing i was hinting at before in my previous post.

another anyway...
i'm not getting all political on you... but i think the google article puts it best. "Think Globally, Act Googley." that is awesome.

Monday, November 19, 2007

is everyone frustrated?

i often get frustrated with technology. frustrated that people don't think of better use cases, additional use cases, and identify that special piece that could make something great. after a short period of frustration, i sit back and wonder why i just got so frustrated. i wonder why these things don't bother other people. i feel like the only one that realizes that this sucks and that it could be so much better. being frustrated with technology isn't good. it's frustrating that it's so frustrating.

well, come to think about it... i get frustrated a lot with a lot of things. and yet i get over it so quickly.

Sunday, November 18, 2007

implementing the clover sensor

i just finished an initial implementation of the hackystat ant-based clover sensor, which processes the xml clover report to send to hackystat. as usual, creating an ant sensor is pretty easy assuming that the report that the tool creates is in xml and makes sense.

here are some issues that i came across.

different attribute names
after writing the sensor, i noticed that the attribute names are different than emma's. for example, clover generated Coverage data has statement_Covered and statement_Uncovered. however, emma generated Coverage data has line_Covered and line_Uncovered. in addition, because there are no required attributes (other than timestamps, type, etc) the Coverage DPD needs to be "oblivious" to the different granularities. this all brings up issues with DPD implementations: will the DPDs contain the last snapshot per tool? if a project uses multiple coverage tools, you might have a random sampling of last batches in the DPDs coming from different tools. that would make it really hard to use the DPDs over time. it seems that the generality of everything that is going on is going to make it harder for the analysis writer or even user to know what is going on, because they would have to know about the details. i thought the whole idea of the abstraction hierarchy was to make "abstractions". let me just throw this out as well: what if we wanted to combine emma's coverage values and clover's coverage values (they are a different set)? there is no way to do that.
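to make the granularity problem concrete, here is a minimal python sketch. it is not hackystat code; the helper names are made up for illustration, and only the four attribute names (statement_Covered, statement_Uncovered, line_Covered, line_Uncovered) come from the sensors described above. the counts are hypothetical.

```python
def split_coverage_attributes(attributes):
    """group '<granularity>_Covered' / '<granularity>_Uncovered' pairs by granularity."""
    grouped = {}
    for name, value in attributes.items():
        granularity, sep, kind = name.rpartition("_")
        if sep and kind in ("Covered", "Uncovered"):
            grouped.setdefault(granularity, {})[kind] = int(value)
    return grouped


def coverage_percent(attributes, granularity):
    """return percent covered for one granularity, or None if the tool never sent it."""
    counts = split_coverage_attributes(attributes).get(granularity)
    if not counts or "Covered" not in counts or "Uncovered" not in counts:
        return None
    total = counts["Covered"] + counts["Uncovered"]
    return 100.0 * counts["Covered"] / total if total else None


# hypothetical sensor data; the attribute names follow the post, the numbers do not
clover_data = {"statement_Covered": "77", "statement_Uncovered": "64"}
emma_data = {"line_Covered": "50", "line_Uncovered": "50"}

print(coverage_percent(clover_data, "statement"))
print(coverage_percent(emma_data, "statement"))  # None: emma never sent statements
```

the second call is the point: an analysis that asks for "statement" coverage gets nothing back for emma data, so whoever writes the analysis still has to know which tool sent which granularity.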

issues with the clover ant task
it actually took longer to configure the clover tool with the ant tasks than it took to write the sensor. in fact, i wasn't able to configure the ant clover tasks the way i wanted. so, technically i'm not done. for some reason i wasn't able to clean before doing the clover-setup task. i don't get why that is. but, i won't bother with it for now.

our test cases are junk
i need to work on those test cases. basically, i think following this will help.

clover can send filemetric data
clover's data looks like this:


<metrics classes="1" methods="18" coveredmethods="7" 
  conditionals="36" coveredconditionals="11" 
  statements="141" coveredstatements="77" 
  elements="195" coveredelements="95" 
  ncloc="270" loc="456" />

to me that looks like coverage data and filemetric data. so... would it be wrong to send both?
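here is a rough python sketch of what "sending both" could mean: one pass over the metrics element, two separate maps out. the real sensor is java, and the output key names for the filemetric side are my own assumption, not hackystat's property names.

```python
import xml.etree.ElementTree as ET

# the metrics element from the clover report shown above
CLOVER_METRICS = """<metrics classes="1" methods="18" coveredmethods="7"
  conditionals="36" coveredconditionals="11"
  statements="141" coveredstatements="77"
  elements="195" coveredelements="95"
  ncloc="270" loc="456" />"""


def split_clover_metrics(xml_text):
    """split one <metrics> element into coverage counts and size metrics."""
    element = ET.fromstring(xml_text)
    attrs = {name: int(value) for name, value in element.attrib.items()}

    coverage = {}
    for granularity, total_key, covered_key in (
        ("method", "methods", "coveredmethods"),
        ("conditional", "conditionals", "coveredconditionals"),
        ("statement", "statements", "coveredstatements"),
        ("element", "elements", "coveredelements"),
    ):
        covered = attrs[covered_key]
        coverage[granularity + "_Covered"] = covered
        coverage[granularity + "_Uncovered"] = attrs[total_key] - covered

    # hypothetical key names for the filemetric side
    file_metric = {
        "classes": attrs["classes"],
        "loc": attrs["loc"],
        "ncloc": attrs["ncloc"],
    }
    return coverage, file_metric


coverage, file_metric = split_clover_metrics(CLOVER_METRICS)
print(coverage["statement_Covered"], coverage["statement_Uncovered"])  # 77 64
print(file_metric)
```

clover only reports totals and covered counts, so the uncovered values are computed as total minus covered.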

Saturday, November 17, 2007

chat about smart snapshots

the hackystat gang just finished up a long thread about large datasets in DPDs. i fought long and hard in that thread to no avail. but, i understand the competing idea and don't necessarily think it's bad. both solutions have their advantages and disadvantages.

through chatting i kinda uncovered something else that i kinda don't like. a disadvantage in my mind - but not necessarily wrong. here it goes:

aaron: where is the smart snapshot?
aaron: the smart thing is going in reverse order?
austen: right. since snapshots are the latest data set
aaron: so you get thirty minutes chunks.
austen: ya
aaron: but what if the timestamps just so happen to span across the thirty minutes?
austen: then u would be done
austen: im not sure wut u are getting at
aaron: hm...
aaron: timestamp 1:59
aaron: timestamp 2:00
aaron: timestamp 2:01
aaron: all part of the same runtimestamp of 2:04
aaron: if you get timestamps from 2:00 - 2:30
aaron: then you are missing part of the data.
aaron: you have no idea whether the data actually spans more than 30 minutes.
aaron: the batch could be hours and hours long.
aaron: but the runtimestamp could be at 2:04
aaron: you'll never know when to stop.
aaron: so, i'm saying that algorithm is an approximation.
aaron: unless there was a way to say get all data with runtimestamp = 2:04
austen: wouldn't u just loop backwards till u find the end of the snapshot?
austen: within the specified time span
aaron: when do you know its done?
aaron: what is all the data?
aaron: theoretically you'll never know if you are done.
aaron: in practice i guess you could say if i don't get any data in the next bucket i look at then i guess i can call it done (aprox).
aaron: still don't get it?
austen: u know its done when u are getting data whose run time is different.
aaron: not.. necessarily.
aaron: because batched data can be mixed right?
aaron: you could be sending data simultaneously
aaron: from the same user
aaron: intermixing data with different runtimestamps but very similar timestamps
austen: if its from the same tool invocation , the sensor will send runtimes that will be the same
austen: so no
aaron: what...
aaron: what if i had two windows open. run full-build at the same time.
aaron: the windows one is a better example.
austen: are u saying that it is broken cause the data will not be ordered?
aaron: the data will be ordered but based on timestamp
aaron: not runtimestamp
austen: yah. ok I see
aaron: basically we are doing approximation.
aaron: we are hoping that the 30 minute chunk has all the data.
aaron: and that it didn't actually span over 2 hours
aaron: becasue there is no way to actually know if it spanned over 2 hours
aaron: which is fine... i suppose.
aaron: that is what happens when you have smart on one side and not flexible on the otherside.
aaron: we need smart and flexible to be optimized and exact
aaron: right now we are sort of optimized and approximated. which is fine.. but people need to understand that is the case.
aaron: and it will be harder to debug if the approximations aren't that great.

to sum up, the current "smart" snapshot idea will definitely speed things up a lot. however, i think there are scary disadvantages. it is an approximation - theoretically you have no idea how big snapshots are. so, when can you actually stop? in this method, to be exact you would have to search the entire day to be sure you have every single data point. but, we probably won't do that. in fact, i think people will implement it differently. so there will be different approximations. that might be bad. approximations might make it harder to debug. but on the other hand, it isn't wrong to do approximations.
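a small python sketch of the difference, as i understand it from the chat. this is not the actual hackystat implementation: the data is made-up (timestamp, runtimestamp) pairs in minutes, the 30 minute bucket size comes from the chat, and the stopping rule ("stop at the first bucket with no matching data") is the one i guessed at above.

```python
def latest_runtime(data):
    """runtimestamp of the entry with the newest timestamp."""
    return max(data, key=lambda entry: entry[0])[1]


def approximate_snapshot(data, bucket_minutes=30):
    """walk backwards in fixed buckets; stop at the first bucket with no matching data."""
    if not data:
        return []
    runtime = latest_runtime(data)
    bucket_end = max(timestamp for timestamp, _ in data)
    snapshot = []
    while True:
        bucket_start = bucket_end - bucket_minutes
        hits = [entry for entry in data
                if bucket_start < entry[0] <= bucket_end and entry[1] == runtime]
        if not hits:
            break  # the guess: an empty bucket means the snapshot is complete
        snapshot.extend(hits)
        bucket_end = bucket_start
    return sorted(snapshot)


def exact_snapshot(data):
    """scan everything for the latest runtimestamp (the 'search the entire day' option)."""
    if not data:
        return []
    runtime = latest_runtime(data)
    return sorted(entry for entry in data if entry[1] == runtime)


# hypothetical batch: runtimestamp 124 (2:04), with one entry sent much earlier (0:10)
DATA = [(10, 124), (118, 90), (119, 124), (120, 124), (121, 124)]

print(approximate_snapshot(DATA))  # misses (10, 124)
print(exact_snapshot(DATA))        # finds all four entries of the 2:04 run
```

the approximate version only touches the buckets it needs, which is where the speedup comes from, but it quietly drops the early entry. the exact version needs either a full scan or a way to query by runtimestamp.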

anyway... that is all i have to say about that. i'm hoping that the performance issues will be solved pretty quickly with the methodologies that we are adopting.

spreading good things

free rice
because of my recent post about the XO laptop, someone mentioned that Free Rice is also a good cause. check it out.

haha. those words are hard. try it out and send this to a friend. andrea has already beaten my top score. i guess i suck.

green your work
from treehugger.com check out how to green your work. i like the suggestion of satellite offices to reduce commute waste. i'd bet that could work out cheaper for some companies. imagine making small 3-person offices in wahiawa, ewa beach, kapolei, waimanalo, etc. i'd bet it would be cheaper than renting one big office in downtown. not to mention instantly improving the quality of life for your employees that live in those areas by saving hours of commute time every day!

Friday, November 16, 2007

one laptop per child

i know that everyone is talking about one laptop per child. but, i think it's important to get the word out and a little perspective on what people actually think about this. so here is what i honestly think.

i think it is a fantastic idea. i think it is expensive but who cares if you change a life. i think i want to help out and get one. i think i'm not sure if that is the best use of my money. i think i should use my money for something like this. i think if i was a kid that couldn't afford stuff like this, then i'd really really be happy and thankful for getting one. i think this could change the world. i think the laptop isn't that cool, but really cool at the same time. i think there are kids in america that could use this. i think bill gates should buy a million of these and donate them. i think it might be a little too slow to really use. i think it's cool that everything runs off of python. i think i would split this with someone. i think i would like to give it to the kid. i think i would love to see a documentary about this. i think all laptops should come with all the harry potter ebooks. i think all laptops should come with an encyclopedia. i think it would be great that one day it was 50 dollars for one laptop. i think one laptop per child is a lot of laptops. i think where do kids in third world countries plug in to electricity. i think the laptop might get stolen. i think i wish i had this when i was a kid. i think it is a fantastic idea.

i think every child that wants a laptop should be able to get one. i think this is cool.

Thursday, November 15, 2007

learning how to run code reviews

yesterday, i sat in a fantastic code review moderated by austen. for the past few reviews we have, to some degree, been working on his ability to moderate code reviews. let me just say that it is really working out very well.

in my opinion, moderating code reviews is definitely a skill. if you don't think so, then i wonder if you have moderated an effective code review. to be the moderator you need to make sure that the time is spent effectively. also, the most important thing is that you make sure that everyone is learning the whats, whys, wheres, and hows of what is going on. another thing to do is to make sure we don't focus on superficial issues and identify where we can make tangent discussions about design. also, leading the discussion in a positive direction is very important. moderators that just say, "needs fixing, follow the coding standard, moving on" are really not doing an effective job.

for this specific code review we spent a while explaining why coding standards are standards by identifying why specific "rules of thumb" have been created. we were also able to shift the discussion (almost accidentally) to a larger more important design discussion (a totally awesome design discussion).

anyway... i'm fairly confident that austen has the ability to lead effective code reviews. i really think that it is an invaluable skill for a software engineer that takes practice. this skill combines a lot of the soft skills that i'm talking about.

in my opinion, it takes not only a deep understanding of how to design software but also the ability to help teach that understanding to other people. that sort of skill separates austen from others.

Tuesday, November 13, 2007

metrics and clouds and other cool ideas

so, i've talked about atlassian's clover metrics before:
two roads and in metrics, metrics, metrics

so, you would think i should move on to another subject. but i won't. so too bad. haha. anyway, i found this on the clover website:

here is what they say about that graphic:

The Project Risks Cloud highlights those classes that are the most complex, yet are the least covered by your tests. The larger and redder the class, the greater the risk that class poses for your project.
...
This Cloud highlights the "low hanging coverage fruit" of your project. You will achieve the greatest increase in overall project coverage by covering the largest, reddest classes first.

wow.. okay that is a cool idea; metrics clouds. but, again i think they (atlassian) are trapped by limitations of their framework and infrastructure. just doing coverage to identify low hanging fruit is probably really really basic, not to mention potentially meaningless and might actually do more harm than good. it's like doing Priority Ranked Inspection with just one metric. nevertheless, the cloud idea is a good one. and yet again, hackystat could do much better.
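to show what i mean by "just one metric", here is a tiny python sketch. it is not clover's formula and not hackystat's Priority Ranked Inspection; the class names, the metric names, the weights, and the values (already normalized to 0..1) are all made up for illustration.

```python
def risk_score(metrics, weights):
    """weighted sum of normalized metrics; a higher score means riskier."""
    return sum(weight * metrics[name] for name, weight in weights.items())


def rank_by_risk(classes, weights):
    """class names ordered from most to least risky under the given weights."""
    return sorted(classes,
                  key=lambda name: risk_score(classes[name], weights),
                  reverse=True)


# hypothetical, already-normalized metrics per class
CLASSES = {
    "Parser": {"complexity": 0.9, "uncovered": 0.8, "churn": 0.1},
    "Util":   {"complexity": 0.2, "uncovered": 0.9, "churn": 0.0},
    "Sensor": {"complexity": 0.6, "uncovered": 0.5, "churn": 0.9},
}

print(rank_by_risk(CLASSES, {"uncovered": 1.0}))
print(rank_by_risk(CLASSES, {"complexity": 1.0, "uncovered": 1.0, "churn": 1.0}))
```

with coverage alone the trivial Util class comes out on top; add complexity and churn and it drops to the bottom. same data, completely different "low hanging fruit".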

re: The Shade Tree Developer

Jeremy D. Miller -- The Shade Tree Developer wrote two blogs today. i read both and i agree with a couple parts and disagree with another. here are two that i'd like to highlight.

cubicles
in little observations jeremy writes

Cubicles == Collaboration Proof Force Field. Is there any worse way possible to arrange a development team?

woah.. i've never heard that from any software developer before. i wonder if he read things like Peopleware. anyway, i realize that offices are the best possible situation, as peopleware points out, but it really depends on how your cubicles are situated. if they look like this then i doubt he would be complaining about them. anyway... i'm not sure jeremy knows what it's like without the walls.

invest in people not the tools
from A Train of Thought: November 13th, 2007 Edition jeremy writes

Invest in People before investing in Tools

I see sooooooo much effort and money going into producing or purchasing tooling that will "enable" bad or undertrained developers to write software with adequate results. Software factories to tell them what to do next. Methodologies try to straitjacket developers into being spec programmers. Tools that frankly have no power because the makers are favoring safety to keep developers from hurting themselves. All powerful frameworks that try to do ease development by leaving developers very little choice or freedom. Yes, the average developer might be underskilled and undertrained, and we generally need to do something about that to make them more effective. My constant contention is that we'd be better off to raise the average developer skill level across the board. In economic terms I think it's cheaper to invest more in developing developers than it is in fancy tooling.

What is so wrong with our value system that we favor using tooling to make people interchangeable instead of investing in people to make them, and us, more effective?

this serves as a reminder of what not to do with hackystat. hopefully, hackystat will never be pitched as a straitjacket for productivity. i've said it over and over again, people and their unique skills are the most important thing in software. no tool, even hackystat, will ever replace a person's skills and knowledge. i don't get why tool marketers can't just stop with the crazy silver bullet talk. it just makes smart people angry.

0-4 worse than striking out (inside joke).

Monday, November 12, 2007

re: "Blogging, Open-Source Software, and Rockstars!"

here is a response to austen's blog: "Blogging, Open-Source Software, and Rockstars!"

i totally agree with what austen said about why developers could be blogging. however, they could be doing the "software design on paper" but in another medium. so, i offer this as another reason. this is the primary reason why i blog. blogging helps share your thoughts with others to help collectively grow knowledge. the sharing part of blogging is really important.

i blog because i want to help provide information to certain select groups of people. the side effect of doing that is the things that austen mentioned. the developers that see no value for blogging, probably don't see a value of sharing their thoughts. that is a real shame.

finally, i think a lot of developers think that their only job is writing code. i totally disagree with that. there are so many other skills that one needs to be a rock star; programming skills are just part of it. it's maybe the least important one; don't get me wrong, i think it is a requirement. soft skills are very important. if one day i'm called a rock star programmer, it won't be because of my hacking skills. i have so much more to offer and aspire towards.

Sunday, November 11, 2007

why i work on hackystat

hackystat is an awesome project. and i work on hackystat for three reasons. but, the priority of those reasons is probably not what you would expect. here they are:

i want to help lead my peers - part of what i do and what i really love doing is helping the "younger" or newer hackystat developers understand what hackystat is all about. actually, it is the soft skill mentoring that is really motivating. i have the chance to help other hackystat developers grow their critical thinking about software engineering research. that's fun and that's what it is all about.

i learn a lot from my peers - typically everyone that works on hackystat is better than me at programming. yes, i even think that the "younger" developers are better at programming than i am. i learn a lot from them. i also learn a lot on how to be a leader, by following leaders like philip. i also love to hear, "aaron, that is stupid". every time i hear that, it makes me better.

(the last reason) i like the actual application of hackystat. i like that hackystat is successful and can be even more successful. i like the domain that we are in, because i like software engineering research. but, i put this last because the other two are much much more important to me. i'd probably be as involved if hackystat was a chat client.

so, you see that it is the people and the environment that make the biggest difference for me. if you are working in a crazy cool project with dumb people, in a dumb environment then i bet you are hating it. i would. people make all the difference. moral of the story: people are what make software engineering awesome. but people can also make software engineering a pain in the ass.

things that i want from google

google has a lot of cool applications, but i want more functionality from their apps. here are a few ideas.

google earth - i want to be able to classify information by geographic locations. a similar idea is what i've seen with the mappings for the southern california wild fires. but, i want it much more dynamic. for example, if i "google" migration patterns of humpback whales, i want it to show me based on the research where humpback whales migrate throughout the year and where they are right now. that is cool. this could work for all kinds of things. for example, where is it daylight right now, what countries have AIDS epidemic problems, or where is the Red Cross helping right now. all of this, of course, is to add much more context to information. it would definitely help me grasp the scale of some of these problems. it would also be cool to use this in education, for example Science on a Sphere at the Jhamandas Watumull Planetarium - Bishop Museum

google reader - i want the concept of a "i have to read this later" category. i always read a little of the title and the first few sentences and i know i need to spend more time reading it. often these reader entries are longer than usual or contain other multimedia inserts like a video. being able to "star it for later" seems like a very good feature.

google reader (2) - i want to know what blogs my friends are subscribed to. somehow being able to do a grouping of the blogs my friends are reading seems cool to me. it could define the type of information my friends care about. for example, my software friends would have an entirely different set of blogs than my high school friends. but, what would be interesting would be to see if there were cross-sections between my groups. if that was the case, then another group should be formed.

blogger - i like the idea of creating a community of independent bloggers. this page, kinda like google groups would be able to aggregate their blogs in a reader type thing. stats on how often the group writes sounds pretty cool. also, things like tag clouds can help define what topics the bloggers write about.

gmail - gmail should have more features like how apple's mail works. the notes and todos are cool stuff.

picasa - picasa is cool, but it isn't that great at the community aspects. the desktop app is really kind of boring. there aren't very good ways of finding pictures or scrolling through them. zzzzzz

i just had a thought.. a lot of the cool features are happening in apple products. hm... that is interesting. while google is making things really accessible, in my opinion it looks like apple is winning my vote for more usable and useful. wow, and i don't even have a mac. google teaming up with apple would be the killer app.

chat help for testing

i was looking at some of the test code for the emma ant sensor. i didn't like how the thing was being tested, but couldn't really grasp how i would improve it. as usual, instead of wasting my time thinking by myself, i turn to someone for help. i've always found that explaining what i need help with to someone else almost always helps me solve the problem. not to mention that putting two heads together almost always solves the problem faster and better.

aaron: testing question.
aaron: the ant sensors are very procedural.
aaron: give it the file. it parses creates the keyvaluemap, adds to sensor shell and then sends.
aaron: testing something like this would require.. the test code to implement some parts of the procedure.
aaron: for example the test code would have to get the jaxbcontext then go through the procedure too.. to get to the keyvaluemap.
aaron: that seems really bogus. and is why the sensors aren't tested good.
aaron: how to fix that. ideally you would want to inspect at a certain period in time. like aspect style.
austen: i never learned aspects, but what you are saying sounds like what you would want to do conceptually. but you have to create some type of state in your testcase in order to figure out anything. this goes for alot of code you want to test. right?
austen: that bogus "fake" state is what you dont like?
aaron: ah.. so make a mock object
aaron: so build jaxb objects
aaron: send it in.
aaron: then get the results.
aaron: in parts.
aaron: i guess you can't test the whole procedural process.
aaron: thats a system test.
aaron: sound good?
austen: well i think in your case, you would just have some testdata
austen: and you would let the sensor run on that to create to the jaxb objects
austen: but more or less yes you wanna create some mock jaxb objects
aaron: okay.. i got it.
aaron: test data -> send to method -> get jaxb objects -> send to sensor logic portion -> get back keyvaluemap -> analyze values
austen: yah thats how you test if it loaded correctly if you had access to that keyValMap. but you gotta figure out a way to get access to it
austen: a problem with the sensor testing is you dont know if it gets sent
austen: and that sucks
austen: you kinda say ok it loaded it into sensorshell ok
austen: but is your sensor really sending it?
austen: *shrugs*
aaron: thats a system test.
austen: ah i see
aaron: k. i think i got it. good idea bouncing session.

this was a good discussion and got me on track on what i wanted to do. these are the types of little collaborations that need to be shared across a team. it's not profound, but it helps keep the software development at a high level by introducing 10 second vignettes of knowledge.
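for the record, here is the shape of what we landed on, as a python sketch. the real sensors are java with jaxb, so everything here is an assumption for illustration: the report format, the function names, and the RecordingShell class are all hypothetical. the idea is just to split the procedure into steps that hand back values, and to swap the sensor shell for something that records instead of sends.

```python
import xml.etree.ElementTree as ET

# hypothetical test data; not the real emma report format
TEST_REPORT = """<report>
  <file name="Foo.java" covered="7" uncovered="11" />
  <file name="Bar.java" covered="3" uncovered="0" />
</report>"""


def parse_report(xml_text):
    """step 1: test data -> parsed objects (the jaxb step in the java sensor)."""
    return ET.fromstring(xml_text).findall("file")


def to_key_value_map(file_element):
    """step 2: parsed object -> key/value map, with no shell or network involved."""
    return {
        "Resource": file_element.get("name"),
        "line_Covered": file_element.get("covered"),
        "line_Uncovered": file_element.get("uncovered"),
    }


class RecordingShell:
    """hypothetical stand-in for the sensor shell; records maps instead of sending."""

    def __init__(self):
        self.added = []

    def add(self, key_value_map):
        self.added.append(key_value_map)


def run_sensor(xml_text, shell):
    """step 3: the procedural glue; returns how many entries went to the shell."""
    for file_element in parse_report(xml_text):
        shell.add(to_key_value_map(file_element))
    return len(shell.added)


shell = RecordingShell()
print(run_sensor(TEST_REPORT, shell))
print(shell.added[0])
```

a unit test can now check each step's output, or the recorded maps, without re-implementing the procedure. whether the data actually reaches the server is still the system test.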

Saturday, November 10, 2007

work on your soft skills

in the last couple of weeks, i was hacking like a mad man for a little while and then dropped down to doing higher level and management stuff. what i'm constantly seeing is that hacking and the other stuff are equally important. in fact, doing the other stuff is what i think separates me from the others.

so, all you students out there. (and i've said this over and over and over and over). hacking is part of it. yes, you need to have the hacking skills. you need to have good software development processes, you need to use the right tools, you need to learn things like Programming Pearls and Effective Java, bottom line you need to be able to hack with the best of them. BUT, you also need the "soft skills". skills like communication, writing, team work, critical thinking, speaking skills, management skills, following skills, even stuff like understanding the market, etc. that other stuff is very important. and i would claim it is much harder to practice the soft skills. you can't exactly read a book about how to make a presentation and think you are guy kawasaki. soft skills need practice. you need to start practicing them now.

here is what you do:

write research papers, the best way is an undergraduate or graduate thesis.

give as many presentations as you can.

work in as many project teams as you can. hopefully, you are in projects that you can lead. but, also be in projects where you are the follower - hackystat is a good project to be a follower and leader.

i've heard other people say you can't learn the soft skills. i think that is a bunch of baloney. it just takes longer and is harder to do.

start doing it now. soft skills are extremely valuable in industry.