Sunday, January 6, 2008

atlassian has a stream of consciousness, hackystat should have one too

atlassian just put out an interesting blog post: JIRA Studio: Stream of Development Consciousness.

basically, they want to expose a developer's activity across different functions in one single place. it's an outboard brain type of thing. see help me help you and hackystat and my outboard brain.

atlassian's attempt to do this is really cool because they are actually providing the functionality in an integrated fashion. the only problem with their approach is that it is limited to atlassian products. duh... hackystat can expose so many more hooks and so many more different types of activities.

we will get there one day. i just wanted to point out that where i once thought we were so far ahead of the game in terms of metrics and reporting, people like atlassian are ahead of the game in other areas.

wow.. i just had a cool idea. what if hackystat used OpenSocial to expose certain metrics. that's way cool.


aaron said...

wow.. okay, i wrote this blog and got up to walk away. then it struck me. i'm dumb. hackystat had a stream of consciousness long long ago. we had daily activity from day one.

i guess i took that for granted. haha. i'm actually a little shocked that i didn't see that.

so, i guess i've been thinking about this whole social networking type thing all wrong. hackystat has been social for a long time. we just didn't have the right user interface to display that information. nor did i fully understand what we did have.

i have to think about this more...

aaron said...

duh... hackystat had that a long time ago with daily project data. wow.... that analysis already had hints of "social networking".

(phew... that realization makes me feel a lot better.)

anyway, i think we already had something that provided our stream of consciousness. but, i think we can kick it up a notch. here is how:

- create a concept of groups rather than projects. these groups mean that you can see my data even though you are not in my project. but you can't get access to everything, only stuff that i share.

- allow your group members to see your hackystat data, filtered somehow. basically, show high level activity. for example, i wrote unit tests, i wrote an httpunit test, i wrote a class that used java.util.concurrent, i refactored, i used tdd, i'm using eclipse, or i'm hacking on lisp right now. these high level activities (minus the context; aka the file name) can provide an outboard development persona that others can be aware of.

- my blog mentioned OpenSocial. the idea would be that this high level activity stuff would be published using OpenSocial. (coolness!).
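
a rough sketch of the first two ideas, just to make them concrete. everything here is hypothetical (the `SensorEvent` fields, the `shared_view` function, the group and sharing dictionaries); none of it is actual hackystat code. the point is only that a group member sees the activities i chose to share, and never the file name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorEvent:
    # hypothetical event shape, not the real hackystat sensor data format
    owner: str
    activity: str   # high level activity, e.g. "wrote unit test"
    tool: str       # e.g. "eclipse"
    file_name: str  # private context that is never published


def shared_view(events, groups, shared_activities, viewer):
    """Return (owner, activity, tool) tuples that `viewer` is allowed to see.

    groups maps an owner to the set of people in that owner's group.
    shared_activities maps an owner to the activities that owner shares.
    """
    visible = []
    for event in events:
        in_group = viewer in groups.get(event.owner, set())
        is_shared = event.activity in shared_activities.get(event.owner, set())
        if in_group and is_shared:
            # the file name is deliberately left out of the published tuple
            visible.append((event.owner, event.activity, event.tool))
    return visible


events = [
    SensorEvent("aaron", "wrote unit test", "eclipse", "FooTest.java"),
    SensorEvent("aaron", "refactored", "eclipse", "Foo.java"),
]
groups = {"aaron": {"philip", "pavel"}}
shared_activities = {"aaron": {"wrote unit test"}}

print(shared_view(events, groups, shared_activities, "philip"))
```

someone outside the group gets an empty list, and "refactored" stays private because it is not in the shared set.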

here is the point. we don't have to get all complicated and theoretical. let's do stuff that we can do quickly and try it out. we could spend years trying to think about the best way to do things. (i personally think hackystat v8 still needs an enterprise-grade, cool looking user interface. the widgets and other things could branch off of that).

(haha. okay this is weird. i'm commenting on my own blog. debating with myself.. boo.. and yay at the same time).

Philip Johnson said...

Hi Aaron,

Great post. I am going to forward the link to the hackystat-dev list so everyone can reflect upon this stuff. (In future, please feel free to forward hackystat related blog postings to hackystat-dev for those who haven't yet realized they need to put your blog in their feed reader. :-)

Anyway. I think the fundamental difference between Atlassian's approach and Hackystat is that Atlassian's "atomic unit of information" is so much larger than ours. They think in terms of events like "commit event" or "blog entry posting" or "defect report", while we think in terms of events like "refactor a method name" or "unit test X was just invoked".

This makes Atlassian's life much easier, because their atomic unit of information is already at the right grain size to put in their Activity Monitor page. Put another way, how many total commit events and blog entries and defect reports get generated by a group of developers in a day? Maybe a dozen or two? That's fine to display in an activity log.

In contrast, Hackystat typically collects many thousands of discrete sensor data items for a typical project during a day. That's far too many to display individually in a page (and, most of those events would be noise to developers).

So, the challenge and opportunity for Hackystat is to produce abstractions of the raw sensor data stream at a level appropriate for display in such a page. This, to me, is an important part of the problem that we don't even question. And it provides us with opportunities (such as displaying someone's TDD percentage) that are not available (and perhaps not yet even conceivable) to folks operating at the grain size of Atlassian's commit-defect-blog entry unit of information.

The second point I want to make is that one of my current development thrusts for Hackystat 8 is to make both our analyses and abstractions annotatable and distributable. This is part of my thrust toward collective intelligence for software engineers. The idea is to go way beyond the kind of "read only" publication of activity that the Atlassian interface provides, and instead make those automatically collected observations the beginning of a conversation between developers (either via twitter, or Simile/Timeline, or some other kind of interface.) If successful, this would be a whole new paradigm for software engineering data, in which automated techniques are seamlessly combined with human interpretation.

We need a Ph.D. student to push on these issues. You interested? :-)

Pavel Senin said...

It's hard to comment after Philip said it all: I agree that the new Atlassian app looks like a hack - a cron job which aggregates new events coming from the set of their apps every two hours (I do the same using Google Reader).
It has little in common with Hackystat, which provides you with quantitative data that you can abstract/aggregate and wrap in any "social" content.
With the data collected by Hackystat, one can develop dozen(s) of data-driven plugins to post some annoying messages to facebook, blogger, feed, jaiku, twitter, sms, or good old (abridged) email.

For me the question is: do we really need this real-time consciousness? In other words, is there any value for developers in posting/reading these streams?

mcannon said...

A hack? Ouch - that hurt! :)

FYI the data is not aggregated by a cronjob at all, it's all generated for you in realtime.

And the data format is not just limited to Atlassian applications, those are just the first applications off the rank. As customers request integration with application X, I'm sure we'll look into doing that (something like Jive Forums or OpenFire would be obvious candidates).

FWIW the more we use it inside a development team, the more angles we see that it could be taken (charts, highlights, aggregation, alerts etc). All in good time.

I agree with Philip that the granularity of action items is uber important. Even our granularity is too small a lot of the time, so the plugin actually "groups" action items together often - just like Facebook says things like "12 friends have updated their profile picture".
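
The grouping idea can be sketched roughly as follows. This is a minimal illustration, not the actual plugin code; the `group_actions` function and the (user, action) input shape are assumptions made for the example.

```python
from collections import defaultdict


def group_actions(items):
    """Collapse (user, action) pairs into one summary line per action.

    Hypothetical helper: actions keep the order in which they first appear,
    and an action performed by several users becomes "N people <action>".
    """
    users_by_action = defaultdict(set)
    first_seen = []
    for user, action in items:
        if action not in users_by_action:
            first_seen.append(action)
        users_by_action[action].add(user)

    lines = []
    for action in first_seen:
        users = sorted(users_by_action[action])
        if len(users) == 1:
            lines.append(f"{users[0]} {action}")
        else:
            lines.append(f"{len(users)} people {action}")
    return lines


items = [
    ("ann", "committed code"),
    ("bob", "committed code"),
    ("ann", "filed a defect report"),
]
for line in group_actions(items):
    print(line)
```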

We'll keep working on it. Watch this space as they say!


aaron said...

hey mcannon, i'm looking forward to learning more about how atlassian is addressing bringing developers together. actually, it would be cool to develop hackystat integration points.

Pavel Senin said...

I'm sorry if it hurts ;). There is nothing wrong in fact, the app looks nice and the UI rocks. My conclusion came from the portlet screenshot posted in the "What does it look like?" chapter, which clearly shows a 2-hour interval between three aggregations: 6AM, 8AM, 10AM. Which in my opinion is a very good idea.
I'd like to have control over the frequency of updates, and it would be nice to have some customized priority ranks attached to each update (by user id, issue id, etc.) along with a filtering mechanism that lets the updates I consider important be posted in realtime.

Pavel Senin said...

Oh-oh, just found that it is Oct30 6pm, not Oct31 6am, in the picture I'm referring to...

synthesis said...

here goes aaron -

as far as the daily stream of consciousness is concerned, people need to remember that not all data is useful to everyone. therefore tagging is _required_ for this to be successful. an easy step is to tag by project. but one might go a bit further and implement "dynamic tagging", which maybe only shows events for projects with coverage above 80% in package* and*.

the problem is the user input. people are really lazy nowadays and don't really wanna spend a lot of time on what to do with the data pouring in. what i described above is a concise example but even that is too much work in my opinion.
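
to make "dynamic tagging" a bit more concrete, here is a minimal sketch. the names (`coverage_above`, `apply_tag`), the event dictionaries, and the project names are all made up for illustration; this is not an existing hackystat api. a dynamic tag is just a rule that gets evaluated against current project data instead of a label someone typed in.

```python
def coverage_above(threshold):
    """Build a dynamic-tag rule: keep events whose project coverage exceeds threshold."""
    def rule(event, coverage_by_project):
        # projects with no known coverage are treated as 0.0 and filtered out
        return coverage_by_project.get(event["project"], 0.0) > threshold
    return rule


def apply_tag(events, coverage_by_project, rule):
    """Return only the events that satisfy the rule."""
    return [event for event in events if rule(event, coverage_by_project)]


# hypothetical data: project names and coverage values are invented
events = [
    {"project": "hackystat-core", "text": "unit tests passed"},
    {"project": "legacy-tool", "text": "build failed"},
]
coverage = {"hackystat-core": 0.91, "legacy-tool": 0.40}

visible = apply_tag(events, coverage, coverage_above(0.80))
print(visible)
```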

the user interface has to be slick. people need to want to use it. hackystat has a stream of consciousness so it's up to the user interface to tackle that stream and make it into something simplistic which then becomes powerful.

now i'm looking at the words i've just written and realizing that this may be completely obvious. so i'll leave you with this idea. cut to the chase and innovate something better than just a "stream of consciousness". have a few dynamic tags implemented for standard lookups and common queries and make them wicked fast through whatever means on the server side. precomputed values in x-minute intervals are fine since the data is always historical. have a "start page" much like igoogle that shows you those values. it'll be hot!

jsakuda said...

I'm not sure if it's the lack of brain function or my misunderstanding but at first glance I didn't really see the Atlassian activity feed app fitting with Hackystat.

It seems like with Hackystat we are concerned about the data more collectively. Like we don't really care, for instance, that I created a defect report for a bug that causes an infinite loop when action X is done. We just care that a bug was created. We don't really need to see exactly what bug it was I created, do we?

Maybe I am not understanding you properly. I guess I didn't see it as being something that Hackystat NEEDED but it would be a cool thing maybe for the Hackystat group to use amongst themselves when writing Hackystat 8.

I suppose you could get the activity type feed to show that you did something like TDD for an hour or something. It seems like the Atlassian thing was meant to show things just as they happened with no further analysis. Whereas in Hackystat we'd hate to see every little buffer change in Emacs, we'd only want to maybe see that there were 4 buffer changes in the last 2 hours. So to make it work for Hackystat we'd essentially have something analyzing data then throwing it up onto the activity feed. Which seems to perhaps defeat the purpose since something like DPD might be able to tell us that anyways.
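
The "4 buffer changes in the last 2 hours" kind of summary could be sketched like this. It is only an illustration under assumptions: the `summarize_window` function and the (timestamp, kind) event shape are hypothetical and not how Hackystat or DPD actually represent data.

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize_window(events, now, window=timedelta(hours=2)):
    """Count events per kind that fall inside (now - window, now].

    events is an iterable of (timestamp, kind) pairs (a hypothetical shape).
    """
    cutoff = now - window
    return dict(Counter(kind for ts, kind in events if cutoff < ts <= now))


now = datetime(2008, 1, 6, 12, 0)
events = [
    (datetime(2008, 1, 6, 9, 0), "buffer change"),    # too old, ignored
    (datetime(2008, 1, 6, 10, 30), "buffer change"),
    (datetime(2008, 1, 6, 11, 0), "buffer change"),
    (datetime(2008, 1, 6, 11, 15), "buffer change"),
    (datetime(2008, 1, 6, 11, 30), "unit test run"),
    (datetime(2008, 1, 6, 11, 45), "buffer change"),
]

for kind, count in summarize_window(events, now).items():
    print(f"{count} x {kind} in the last 2 hours")
```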

I guess I'm not a huge fan of getting a bunch of updates in a feed. It might be useful if each person had like a compiled summary at the end of the day available for others to see but maybe not a real time feed.

Pavel Senin said...

Yeah, tagging is required , thanks synthesis for the post.

I'm thinking about timelines after all: Hackystat analysis works perfectly fine with asynchronous data transmissions from users (I can work offline for 3-5 days and then stream a whole bunch of data to the server and see updated telemetry after), but if a stream of consciousness is concerned, how will it work in such a case?
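
One possible answer, sketched under assumptions: order the stream by the time an event happened rather than the time it reached the server, so a late batch slots into its historical place. The `ActivityStream` class below is hypothetical and is not part of Hackystat.

```python
import bisect


class ActivityStream:
    """Hypothetical stream that stays ordered by event time, not arrival time."""

    def __init__(self):
        self._entries = []  # (event_time, text) tuples, kept sorted

    def receive(self, batch):
        """Accept a batch of (event_time, text) pairs in any order."""
        for event_time, text in batch:
            bisect.insort(self._entries, (event_time, text))

    def timeline(self):
        """Return the texts in event-time order."""
        return [text for _, text in self._entries]


stream = ActivityStream()
# a teammate's event arrives first, even though it happened on day 3
stream.receive([(3, "teammate ran unit tests")])
# later, data recorded offline on days 1 and 2 is uploaded in one batch
stream.receive([(2, "refactored a method"), (1, "wrote a unit test")])
print(stream.timeline())
```

The trade-off is that readers who already looked at the stream would miss the backfilled entries unless the interface marks them somehow.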

austen.ito said...

I'm going to have to disagree with Julie here. I actually enjoy getting information in feeds. Basically I see JIRA Studio as an automated Twitter. Twitter is awesome because you get to see quick little snippets of what is going on in someone's life. Now what if I wanted quick little snippets of what is going on in my project or what is going on with my teammate? Do I want to read a report like the version 7 Daily Project Data? Probably not. I want something fast, but I don't really know what I want. Getting information through feeds as a "stream of consciousness" is information that is small enough for me to quickly scan through. If I see something of interest then I can dig deeper through reports or charts to learn more.

Maybe I see streams of consciousness as filters for the large amount of data that I'm too lazy to sift through.