Wednesday, January 29, 2014

Social Media Bragging Rights?

I am beginning to get very irritated with Google Plus and LinkedIn. Surprisingly, more with them than with other SM sites (Facebook/Twitter in my case).

Why so annoyed? It seems as if they are developing giant social graphs which are devoid of meaning. I am added to several G+ circles per day. I don't use G+ for anything much, so why anyone would want me is beyond me. However, I can imagine people doing it as an entree to expanding their social graph and thus their credibility.

How does it happen? G+ just has random people put me in their circles. There is no easy way to block it, it seems.

LinkedIn lands me on the "add your mailbox to us" page. It looks as if that page is part of the normal flow, but it can be bypassed.

Sad really, because we can get value from these sites if we can use them the way we want to. But as soon as the sites start behaving like this, their utility declines and people stop using them or drift away.

Thursday, June 6, 2013

How much classification is enough?

This certainly isn't new thinking, but it gave me a couple of aha moments. There seem to be two ends of the spectrum regarding information retrieval/storage/management. Here I am thinking about things like email, tweets, general information (trivia-like).
Do you classify and organize - e.g. lots of Outlook folders all arranged neatly, with rules putting incoming messages into these folders? Or do you keep them in a giant heap and use search tools/techniques to help you locate critical emails when you need them?
It turns out that philosophies of managing email differ quite a lot from person to person. I used to be an organizer and classifier - until I realized that I could never set the rules up sensibly. Things would get "mis-filed" so it was hard work finding them anyway. Should I organize by date? Should I organize by sender (in the inbox at least)? Should I organize by keywords (oh dear, which ones, and how many copies do I need to keep)?
Of course I could take an entirely different approach. I could pile all the mail into a single folder and rely on very potent search tools. But that also puts a significant burden on me. I have to remember the words that might have been used - and their spelling! If I get an email from an English colleague who is talking about "organisation" and I want to correlate that with some observations from an American colleague, it is going to be tough using just a search method. Search is improving, but it still isn't easy.
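To make the spelling problem concrete, here is a minimal sketch of a search that folds British and American spellings together before matching. The variant table, the function names and the sample emails are all made up for illustration; a real tool would need a far larger table (or stemming), so treat this as a sketch rather than a recipe.

```python
import re

# Hypothetical, tiny table of British -> American spellings (assumption:
# a real search tool would need a much larger list or a stemmer).
SPELLING_VARIANTS = {
    "organisation": "organization",
    "colour": "color",
    "centre": "center",
}

def normalize(text):
    """Lowercase the text, split it into words, and map known variants."""
    words = re.findall(r"[a-z]+", text.lower())
    return [SPELLING_VARIANTS.get(w, w) for w in words]

def matches(query, message):
    """True if every normalized query word appears in the normalized message."""
    message_words = set(normalize(message))
    return all(w in message_words for w in normalize(query))

# Made-up subject lines standing in for a single giant mail folder.
emails = [
    "Notes on the organisation of the Houston team",
    "Organization chart attached",
    "Lunch on Friday?",
]
hits = [e for e in emails if matches("organization", e)]
```

With this kind of normalization, searching for either spelling finds both colleagues' messages. Without it, I am back to remembering who spells what.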
My natural tendency is to "classify at the moment of use" - i.e. to base my filing "system" around search. That extends to physical filing too - one look at my desk would convince anyone that I don't organize! Of course that drives madame nuts, but that's a story for a different time.
Which are you? Do you tend towards one end of the spectrum or the other?

Friday, January 25, 2013

Context is king in signage

I have started a new project recently. It is a very large one - about 30 work packages, lots of people and significant complexity. As with all endeavors, we have to learn the landscape and learn it fast. After all, we are paid to be productive, not to learn. Similarly, when we get to a new city, we have the objectives (conflicting at times) of finding the hotel and learning the city.
The big aha observation is that all signposts are created by people who know where they are. They are not necessarily created for those who want to find things. That's (perhaps) why we have maps and why navigation using something like Google Maps works so well. But I digress.

On this project, the work packages are imaginatively numbered wp101, wp102, ... up to wp120 for the first release. And then wp201... for the second. Work packages have calendar-based deadlines. And of course we have times of day to worry about too. Especially during the work day/early afternoon.

So when there is a conversation in WP116 saying we need to move "that" to 220, what do we mean? Is it to move it to wp220? Is it to move the due date out to February the twentieth? Or should it be discussed at the 2:20pm status meeting?

Of course when you have the context, you have enough metadata to figure it out. But when you are a stranger in project land, it requires considerable mental energy to figure it out. The signposts (220 in this case) are used by the people who know, and those of us trying to use them for direction scramble to keep up.

On projects like these, it is not so much of an issue. But in the "real" world, putting yourself in the mind of all of the user constituencies is important. Is this sign put there because the law says we have to have signs? Is it to help natives who need to find quick alternate routes? Is it to embrace strangers/visitors? All of the above?

Monday, January 21, 2013

There's a place for everything and everything in its place - except when there isn't

Charles Goodrich (1790 - 1862) popularized the epigram "A place for everything and everything in its place" (http://en.wikipedia.org/wiki/Charles_A._Goodrich). However, I suspect he wasn't thinking about what to do with data. We are accustomed to thinking in hierarchies, and to assuming that our data are neatly arranged - like an English formal garden. But in reality, what we have is a wild and untamed data wilderness. We never know what nuggets we will pick up where, or what "gifts" flocks of birds excreting seeds will deliver, allowing new things to sprout in unusual places.
Organization and arrangement are fantastic and efficient for transactional activities - those things where we, by force of will, have designed things. But where we have non-transactional, undesigned things, we have to adapt to what we find.
This distinction between adaptation to what we find and purposeful design is crucial. If we attempt to impose purposeful design over things we find, we commit a variety of sins:
  • We abstract too early - ending up with meaningless abstractions. After all, the ultimate data model has one box and one line. "Thing" is the box. "Relates to" is the line. Then we end up with a series of triples. But we can't see any reality in there.
  • We name things that don't really have names. But categorization demands it.
  • We box ourselves in - we put "everything in its place", without considering that actually there are multiple places.
  • We throw stuff away because it doesn't fit our notion of predefined categories.
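The "one box and one line" model from the first bullet can be sketched in a few lines. The facts below are invented for illustration and the names are hypothetical. The point of the sketch is only that the model happily stores everything while telling us nothing about what any relationship means.

```python
from collections import namedtuple

# The "ultimate" model: every fact is (thing, relates_to, thing).
Triple = namedtuple("Triple", ["subject", "relation", "obj"])

# Hypothetical facts, invented for this illustration.
facts = [
    Triple("Chris", "relates_to", "pager"),
    Triple("pager", "relates_to", "Houston"),
    Triple("invoice-17", "relates_to", "Chris"),
]

def related(thing, triples):
    """Return everything directly linked to `thing`, in either direction."""
    linked = set()
    for t in triples:
        if t.subject == thing:
            linked.add(t.obj)
        elif t.obj == thing:
            linked.add(t.subject)
    return linked

chris_links = related("Chris", facts)
```

The query works, but "Chris relates to pager" and "invoice-17 relates to Chris" look identical. Whether he owns one and owes the other has been abstracted away.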
We are now in a "big data" world, where we have the opportunity to keep more stuff, where we have multiple classification systems, and where we can draw inferences from observing metadata. Yet we still try to impose "design ahead" schemata on things outside of our control. This is often where master data management, large data modeling projects and other attempts to impose order falter. We argue over nits and create beautiful (but sometimes irrelevant) abstractions.
Let's not forget that there is a place for order. But also remember that sometimes things just are and it is our job to interpret rather than to order.

Things have changed a bit since 1999

I am Chris’s pager. You know me as 555.818.1234. March 9th, 1999 was the most traumatic day of my being. I had faithfully woken him up at 5am, had the screen window cleaned, and been put in my place of honor at his left hip. We left home to go to the airport for a day trip to Houston. Trips to Houston are always exciting because I get to show my wares many times in a day.


Flying however is not my favorite activity because he always stuffs me, along with my great rival the cell phone, into a dark, black bag before going through security. He explained this once, but my battery was low, so I switched off.

On this morning, he stuffed me into the bag, and unceremoniously dumped me into the overhead bin of a Southwest Airlines plane to Houston. During the flight, I was paged, and managed, through vibrating hard, to escape into the bin. The bin was clean, and uncluttered – and best of all, the cell phone was still in the bag. The cell phone was less trouble than normal; it had been switched off for the flight.

The flight landed, and a somewhat groggy Chris pulled down the bag, and didn’t notice that I had escaped. I vibrated at 60-second intervals to try to alert him, but to no avail. He ignored me.

So there I was. Abandoned in a strange city. In a strange compartment, thinking I would never find him again. Every time I was called, I hoped it was he, but no, he seemed to have forgotten all about me.

I can only imagine how frantic he was without me. I kept getting called – almost 20 times in the day, and knew I was important. The cell phone had pride of place that day, though. I hoped that Chris wasn’t going to think that I was dispensable, and rely totally on the cell phone in the future.

Meanwhile, the plane was cleaned and took off on an odyssey across country. We even left Texas and went to Rhode Island. At the Providence airport, a cleaning crew member found me, and handed me to lost and found. I heard those clever people at Southwest Airlines wondering how to reunite me with Chris.

One of them turned me over and deciphered the number tattooed on my back. She called it, and was given Chris’s name. They looked in something they called the reservation system, and found his reservation with our phone number in it. Luckily it was our home and not that cell phone’s number. Several people tried calling but there was no one home to answer. My worst fears were realized. I had been tossed aside and replaced. And then late that night, the phone rang. I heard the familiar voice. Chris called! He was worried about me after all. He wanted me home. He authorized the use of his credit card to have me shipped. He left a long message warning that if the Southwest people were to try to call him at work, I would vibrate.

I passed a more peaceful night than I expected. Still I wakened with a ringing in myself at 5:00am. It was no help though. I was still up North where it was cold out. I spent a fretful day, and eventually found myself on another of those beautiful Southwest Airlines planes going home to Dallas. Every time the plane stopped, I hoped that I had arrived, but no. In fact it was worse, I was going to Houston. I didn’t know the plan had changed and that Chris was going back to Houston that night, and would rescue me.

I sat quietly in my box, not knowing when he would pick me up. And then, I heard his voice. He was coming to get me. I had not been abandoned after all. We had a happy reunion. He illuminated my display to check the pages. I felt needed again. He put me back in my holster, we were a team once more.

As we left Houston the following day, the cell phone’s battery ran out. I was again the only way to reach him.

Monday, December 3, 2012

Grumpy Old Man Plays the Role of Query Optimizer

Another in the continuing saga of MongoDB. As you may have gleaned from other posts, I am yet to become an all-in fan of the product. I have some appreciation of its capabilities, but am still finding it syntactically and semantically hard going.
The class from the 10gen team is very well done. A few glitches, but nothing major. I have been on expensive classes where the material has been of lesser quality.
So kudos to 10gen - especially Andrew.
The strange bits. And this week there are two. First off, there is a thing called the "aggregation pipeline". Cool concept: you specify as steps in a pipeline what to do with the data, so you can do things like group, sort, and generally report usefully on the data contents. Cleverly (of course), the results from each stage of an execution pipeline are mongo (JSON) documents. So you can operate on them just like any others. Nice.
But, in order to do quite complicated queries, you have to specify every step of the pipeline yourself. And you have to figure out the order you perform them in (to get the right results and to optimize the performance). If I sort before I filter, then I am probably doing too much work, for example. In SQL databases - at least the better versions - the query optimizer is supposed to figure this out for you. So now I am having to be my own query optimizer. Not happy about that. Yeah, there are reasons; sharding might be tricky to optimize (I don't know).
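To show why the order matters, here is a toy stand-in for a pipeline, written in plain Python. It is emphatically not MongoDB's engine: run_pipeline is a hypothetical helper that understands only an equality $match and a single-field $sort, and it counts how many documents get sorted. The documents are made up.

```python
def run_pipeline(docs, stages):
    """Toy pipeline runner (an illustration, not MongoDB's engine).

    Supports {"$match": {field: value}} and {"$sort": {field: 1 or -1}}.
    Returns the resulting documents and how many documents were sorted.
    """
    sorted_count = 0
    for stage in stages:
        (op, spec), = stage.items()  # each stage document has exactly one operator
        if op == "$match":
            docs = [d for d in docs if all(d.get(k) == v for k, v in spec.items())]
        elif op == "$sort":
            (field, direction), = spec.items()
            sorted_count += len(docs)
            docs = sorted(docs, key=lambda d: d[field], reverse=(direction == -1))
        else:
            raise ValueError("unsupported stage: " + op)
    return docs, sorted_count

# Made-up data: 1000 documents, one in ten belongs to category "a".
docs = [{"category": "a" if i % 10 == 0 else "b", "price": 1000 - i}
        for i in range(1000)]
match = {"$match": {"category": "a"}}
sort = {"$sort": {"price": 1}}

filter_first, sorted_when_filtering_first = run_pipeline(docs, [match, sort])
sort_first, sorted_when_sorting_first = run_pipeline(docs, [sort, match])
```

Both orders return the same documents, but filtering first sorts a tenth as many of them. In this toy, as in the class exercises, nobody reorders the stages for me.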
The second - and this is quite an uncomfortable feeling, but I dare say I will get used to it - is referencing. This is a bit tricky so I will illustrate (I hope correctly!)
If I want to group by the value of category, where category is a key in a document,
I would have to write something like {"$group":{"_id":"$category"....}}

I read this as make the _id field the value obtained from the category key. Each one in quotes (programming in strings again, uh), and then the $ to dereference the name of the category key so I can use its value. That's an awful lot of symbology to remember.
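Here is a small sketch of what such a stage means, again in plain Python rather than against a real server. The stage document uses the real $group operator and the $sum accumulator, but the count field, the toy_group_count helper and the sample documents are my own assumptions for illustration.

```python
from collections import Counter

# The stage as a document. "$group" is the operator; "$category" means
# "the value of each document's category key". The "count" field using
# {"$sum": 1} is an assumption added to make the example complete.
group_stage = {"$group": {"_id": "$category", "count": {"$sum": 1}}}

def toy_group_count(docs, stage):
    """Toy evaluation of a count-per-key $group stage (not MongoDB's engine)."""
    key_ref = stage["$group"]["_id"]
    field = key_ref[1:]  # strip the leading "$" to get the key name
    counts = Counter(d[field] for d in docs)
    return [{"_id": value, "count": n} for value, n in sorted(counts.items())]

# Hypothetical documents.
docs = [{"category": "books"}, {"category": "tools"}, {"category": "books"}]
result = toy_group_count(docs, group_stage)
```

The result is one document per distinct category value, with that value sitting in _id. Seeing the "$" stripped off by hand makes the dereferencing a little less mysterious, if no less fiddly.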

Grumpiness quotient has gone up this week!

Tuesday, November 20, 2012

Grumpy Old Man and MongoDB - Indexes and things

This week (week 4 in the excellent 10gen class on MongoDB) has us looking at things like indexes, profiling, etc.
I am getting used to the syntax (but still dislike the "programming in quotes" model and the use of cryptic special values for specifying sort sequence, etc.).
Lovely looking feature for geospatial indexes, but quite tricky to use. At the base level, the distance measures on the spherical model are expressed in radians. So we have to do that conversion somewhere. PITA so far. I can see why, but that isn't exactly handy. Would like (and will build) some other mechanisms to sort that out.
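The conversion itself is just a division by the Earth's radius. A minimal sketch, assuming a mean radius of about 3959 miles (other sources use slightly different figures), with helper names of my own choosing:

```python
# Approximate mean radius of the Earth in miles (an assumption; published
# values vary by a few miles depending on which radius is used).
EARTH_RADIUS_MILES = 3959.0

def miles_to_radians(miles):
    """Convert a surface distance in miles to an angle in radians."""
    return miles / EARTH_RADIUS_MILES

def radians_to_miles(radians):
    """Convert an angle in radians back to a surface distance in miles."""
    return radians * EARTH_RADIUS_MILES

ten_miles = miles_to_radians(10)
```

Ten miles comes out at roughly 0.0025 radians, which is the sort of number a spherical query expects as its maximum distance.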
If for no other reason, the radians-based model takes no account of direction. Maybe I want coffee shops within 10 miles north of me (because I am heading in that direction), none south of me, and maybe within 1 mile each side of the route. I am sure I could code that!
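A rough sketch of that filter, done in application code after the database has returned candidates. It assumes the route runs due north and uses a flat-earth (equirectangular) approximation, which should be fine over tens of miles. The function names, the coordinates and the shop names are all invented.

```python
import math

EARTH_RADIUS_MILES = 3959.0  # approximate mean radius (assumption)

def offsets_miles(origin, point):
    """Approximate (north, east) offset in miles from origin to point.

    Both arguments are (latitude, longitude) in degrees. Uses an
    equirectangular approximation, reasonable for short distances.
    """
    lat0, lon0 = map(math.radians, origin)
    lat1, lon1 = map(math.radians, point)
    north = (lat1 - lat0) * EARTH_RADIUS_MILES
    east = (lon1 - lon0) * math.cos(lat0) * EARTH_RADIUS_MILES
    return north, east

def ahead_to_the_north(origin, places, max_north=10.0, half_width=1.0):
    """Keep places up to max_north miles north and within half_width miles
    either side of a route assumed to run due north."""
    keep = []
    for name, location in places:
        north, east = offsets_miles(origin, location)
        if 0.0 <= north <= max_north and abs(east) <= half_width:
            keep.append(name)
    return keep

# Made-up starting point and coffee shops.
me = (32.78, -96.80)
shops = [
    ("five miles north", (32.8524, -96.80)),
    ("five miles south", (32.7076, -96.80)),
    ("north but well east", (32.85, -96.70)),
    ("too far north", (33.00, -96.80)),
]
on_my_way = ahead_to_the_north(me, shops)
```

Only the shop five miles north survives the filter. A route that bends, or heads any way but north, would need proper bearing and cross-track calculations.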
And then for some reason, the MongoDB shell treats using the geospatial spherical model differently from other models. It is invoked through the db.runCommand(...) syntax and not the usual db.collection.find(...) syntax.
Also, since you don't specify which index to use in the db.runCommand(...) syntax, it fails if you happen to have two 2d indexes defined. Promising feature, but could use work.
Utilities are handy - mongotop and mongostat are helpful indeed.
In many ways MongoDB reminds me of the 1970s system ADABAS, but with updated syntax.