Free investment tip.

Do you want an investment tip?  If I had a few million dollars lying around I'd start looking for a startup aiming to write software to support the upcoming design, fabrication and manufacturing revolution.  The sort of software you need to prototype, test, and yes, machine or 3D print physical objects.

Sure, there is already lots of software for this on the market, but it is extremely complex, expensive and most of it, if not all, is targeted at Windows.  Makers don't use Windows.  Makers use OS X or Linux.  And it is the Makers you have to keep an eye on, because they are the people defining what is right now a relatively small niche.  Even if you ultimately will sell larger volumes on Windows (though I am not so sure you will).

The reason I think this is going to be a good area to invest in is that the software houses in this industry are going to miss this market entirely.  They are, after all, professionals with decades of experience and wouldn't think of licensing even scaled-down versions of their ultra-expensive CAM software for calculating optimal toolpaths at prices affordable to mere hobbyists.  Much less engage with hobbyists to externalize innovation and let mere users extend the ecosystem with their own extensions and tools.

As manufacturing becomes available to the masses (both in the shape of cheaper machines, but also in the shape of companies selling access to equipment through the net), in much the same way computers slowly became available to the masses in the late 70s, there will be a need for better, more user-friendly, more portable, and more extensible and adaptable software.  Because that is the big hole in the ecosystem right now:  the software.  There are some cheap packages for doing rudimentary 3D design.  There are also some packages to drive mills and 3D printers.  But there is a very noticeable absence of decent CAM packages to bridge the gap between those two worlds.  That is: for the sort of audiences that are just now starting to enter the stage.

I predict that within the next 2 years a startup will emerge from the Bay Area that addresses this gap and makes an affordable CAM package that runs on something other than Windows.  In 5-6 years the industry giants are going to be scratching their heads and asking themselves how on earth they could not have seen that coming -- and people will point out to them that history just keeps repeating itself.  They will miss the market precisely because they think they know everything there is to know about their particular domain and they will be too heavily invested in the status quo to adapt.  This happens in almost every industry from time to time.

There is also going to be a lot of innovation in the types of fabrication processes that are possible and practical.  From advanced voxel machines to manufacturing living tissue.  This is going to require flexibility and hackability.

So if you have a few million dollars lying around and you want to invest in something that may become a really exciting future market, I suggest you pack your bags, go to the Bay Area and start looking for startups that want to build affordable CAM-software. Of course, you'll have to be in it for the long haul, but I think there are fortunes to be made here.

Through sites like http://cloudfab.com/ you can already fabricate one-offs at relatively low costs.  These types of companies will also be very important in helping you figure out what fabrication processes will be appropriate for you since this area is experiencing very rapid innovation.  But you still have to provide them with decent designs and that requires proper tools.  
A package like SolidWorks is going to cost you something in the neighborhood of $4,000 to $5,000.  It is a niche product for industrial use, and a bit overkill for the hobbyist.  Of course, it is also quite obviously too expensive for the $50-$500 price range that a hobbyist might be able to pony up.  Judging by the number of hobbyists I've encountered that seem to know SolidWorks and similar packages without having a job where they'd use it, I'd suspect that there are a lot of pirated installations of SolidWorks out there.

Oh and of course, SolidWorks is neither available on Mac OS X, nor on Linux.  Which is a big handicap even if Dassault Systèmes were to create a "Maker version".

If you want to go into the business of building these tools I wouldn't worry too much about being undercut by traditional players.  By the time they catch on and slash prices they'll have nothing worthwhile to offer to the "amateur" segment anyway.  They'll sit on the fence until it is too late.


There is no Web 3.0

People keep talking about Web 3.0 expecting the sort of significant shift that Web 2.0 represented.  I don't expect there to be one.  I view Web 2.0 as the transition where certain aspects of the web "went hockey-stick".  Where certain potentials already inherent in the web were realized -- mainly that it went from being a one-way medium to a many-to-many medium.  Also we saw the emergence of businesses that "get" the web.  But also small things like Firefox challenging Microsoft and bringing actual progress back into the stagnant world of web browsers.

The Next Big Thing.

So what is The Next Big Thing?  The next significant shift?  I think it is about the physical world.  About machine to machine communication, about "dumb" devices becoming "smart", or lacking that, connected.  About sensor networks, remote control and about everyday devices interacting with each other in useful ways across vendors and types of gadgets.  About devices and gadgets becoming addressable in sensible ways.  About connectivity coming to devices that we previously thought of as too cheap or too small for it to be cost effective to put communication and processing capabilities into them.

Very little of this is new.  Machine to machine communication is already an established art, but it has lacked openness and the right sort of mindset that is needed to fuel explosive growth in the consumer market.  It is still a specialty field.  And connected devices are still somewhat of a novelty.

Consider a trivial example such as your pulse monitor.  You have a sensor you place on your body and a watch that can accept data for storage and processing.  If you work out at a gym they might have a treadmill that can accept the signal from your sensor, but that's about it.  Nothing else talks to it.  If you want your mobile phone to interact with the sensor you need to do a bit of hacking.  And it isn't for everyone to design their own electronics, figure out the protocols and write their own software to do this.

Various makers of home entertainment systems have had systems for allowing their devices to interact for years.  Sony, Bang & Olufsen etc. The TV, DVD player and amplifier can talk to each other, albeit in crude and limited ways.   We've had this for at least 25 years now.  But it hasn't really taken off.  Because every manufacturer only offers interoperability between their own devices.   Which isn't terribly useful as you are likely to have a mish-mash of different devices from different manufacturers bought at different points in time.

Most everyday devices capable of interacting wirelessly exist within closed ecosystems.  Closed proprietary protocols, and when the protocols are indeed "open" they are usually special purpose or they are designed by committees that produce bum-numbingly boring specs that most people would rather not play with.

As far as the hardware is concerned, you would be amazed how cheap processing power has become.  Texas Instruments has some microcontrollers that cost $0.25 apiece.  Low-power communication hardware for wireless networking is getting ridiculously cheap as well.

Form matters.

If I may go off on a tangent here for a while; the qualities of a standard or a spec matter greatly.  For it to be useful to a big audience it has to be brief, concise and precise.  What it describes also has to be simple.  Any excess fat must be trimmed.  It is extremely important that the spec is written from the point of view of someone who has actually implemented what is described.   SMTP, HTTP and even TCP/IP didn't win because they were the best possible protocols -- they won because it was practical to implement them.  Because they were simple enough.  X.500 and the OSI stack were not, and quite deservedly starved to death from lack of enthusiasm.

If a spec is not usable as a practical blueprint for someone building or implementing the spec, it is not a useful document.

Also, the format matters.  Again, look at the standards that make up the bulk of basic Internet technologies.  They are text files.  Now compare to some of the more unwieldy standards and specs of today that are maintained as big, fat Word documents.  Often horrendously badly formatted Word documents at that.  There are a number of reasons why word processing files are not suitable for authoring specs and standards.  However, the minutiae of this topic are beyond the scope of this blog posting.

If you want a good example: compare the SAML specs (http://saml.xml.org/saml-specifications) with the OAuth specs (http://tools.ietf.org/html/rfc5849).  No, really, have a look and see which you would rather work on.

Clarification: I think SAML is a good idea.  But it is a pain in the butt to work with because of the massively big, badly written specs.


Still, the main problem seems to have been in the hardware domain.  Anyone and their grandmother tinkers with software.  Not a lot of people play with hardware.  And for good reason.  You need to have a fair bit of basic knowledge to play with hardware.  And to be honest, the hardware mostly isn't all that great.

For instance, I have been playing a bit with mesh networking and the XBee devices.  They are interesting devices, but to be quite honest, not very easy to work with.  The manufacturer does not really understand the potential of their product and has not paid much attention to making the devices easy to configure.  If you want to configure an XBee device you'll have to set aside a few evenings to study the docs.  Then you will need to figure out how to work around their worthless, Windows-only configuration program.  That takes a lot of trial and error, and I would assume that it is probably better to write your own software for this if you plan to spend much time working with these devices.


However, there is hope.  As advanced microcontrollers have become cheap, some brilliant solutions have appeared.  Chief among them is the unassuming, but all-important Arduino.  The Arduino is the first widespread prototyping platform that makes hardware truly accessible to the masses.  It has three things going for it.  First, the hardware:  it is cheap, offers adequate processing power and lots of I/O, and the physical interface is exactly what a tinkerer needs.

Second, the software.  While the main IDE used with the Arduino is aimed at non-programmers, you can still elect to take a more "traditional" route if you prefer -- using avr-gcc and traditional C programming tools.  To help beginners (and more experienced hardware hackers) along, there is also a large collection of libraries available for the Arduino, for doing anything from driving displays to driving servos.

Third, the Arduino has a vibrant community around it.  There is an abundance of forums, howtos and people who are willing to share their knowledge.  If you want to hack hardware using an Arduino you can accomplish a lot of fun and interesting things by participating.

The Arduino is unimportant to the consumer in much the same manner as Java, PHP, Linux or MySQL are unimportant to the users of the web.  It is important for the people who innovate, tinker, build and invent.  It significantly lowers the barriers to prototyping hardware solutions and it has adequate I/O and processing capabilities to accomplish interesting things.

Interestingly, Google has recognized this and is now using the Arduino as a platform for its Android Open Accessory Development Kit.

Right now the Arduino is one of the most important gadgets on the market.  Because it significantly lowers some important barriers.

Era of talking hardware.

As I said earlier, I do not believe there will be a Web 3.0 in the sense that we will see a marked shift.  There is perhaps a Web 2.5, which is "more better" and the combined fruits of Big Data and machine learning, but I don't see any distinct shifts for the web as such up ahead.

I think the Next Big Thing is having gadgets talk to each other, using technology that is far more available to the masses than is currently the case.  The key to this is simplicity, and in the electronics industry, the winners will be those who realize that to fully exploit this opportunity they have to re-think who their target audience is.


Why talk of "segmentation" frightens me.

When I wish to provoke thought I often state that "there is no such thing as segments - only successful or unsuccessful products" or that "I do not believe in segmentation".

The first statement is, at best, a simplification about how I think about customers and users in relation to products, but it is a simplification that is closer to what I believe than just stating that any market can, or should, be thought of in terms of segments.  One could say that I am rounding off to the nearest significant decimal.

The second statement is largely true.  I think segmentation is counterproductive because it tends to act as a handy excuse for not succeeding in appealing to the user.  But more importantly, I think segmentation is counterproductive because the partitioning of the customer base is often not sufficiently knowledge-based.  Ask for hard numbers and more often than not, you will get the results from a poll or a focus group.  (This is so wrong I cringe at the lack of scientific validity whenever I see the results from polling).  And even when segmenting is informed by at least some data,  people don't always understand what to do with that knowledge.

By observation, not by decree.

Now, I have already admitted that "there are no segments" is not something I truthfully believe in.  But this statement is a useful stand-in for what I really believe.   Because what I really believe may be a bit trickier to understand on an intuitive level.  What I really believe is that segments can only be derived from recent observation.  There are two reasons for this.

The first reason is that merely guessing what segments there are is...well, just guessing.  Of course, there are many forms of guessing.  Some of them are cleverly disguised as science.  You can ask people what segment they are in.  You can ask questions that you think will determine what segment they are in (which might work, but rarely does) or you can do what a lot of people do:  recycle random factoids from books or from the web.

Litmus test: If you have names for the segments even before you have data supporting their existence or describing their size or importance (not always the same), you are not being scientific.

(A quick comparison.  Automated news aggregation sites, like Google News, are all about automatic segmenting of news reporting.  Or "clustering" as it is called in search or machine learning nomenclature.  I forget which.  The software is not really pre-configured to know anything about any topic.  It works by ingesting a fair share of the world's newspapers and looking at significant similarities between articles to determine when new themes appear, breaking stories etc., and then clusters these articles together.  It knows nothing about, for instance, ice skating -- but if a significant event were to occur in the ice skating world, it would most likely be able to tell that "something" happened and that these articles belong together.  Even though the system had no way of knowing or anticipating anything about ice skating.  These clusters come and go.  Some are short-lived, some are more persistent -- perhaps even permanent.  I see this as the same phenomenon.  Only:  I doubt that most segmentation in market research is done as diligently.)
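The clustering idea in the comparison above can be sketched in a few lines.  This is not Google News' actual algorithm -- just a toy illustration of grouping articles by word overlap with no built-in topic knowledge.  Real systems use TF-IDF weighting and far better similarity measures; this sketch uses plain Jaccard overlap on word sets:

```java
import java.util.*;

// Toy topic clustering: articles that share enough significant words get
// grouped together, with no built-in knowledge of any topic.
class NewsClusters {
    static Set<String> words(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
    }

    // Jaccard similarity: |intersection| / |union| of the two word sets.
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
    }

    // Greedy clustering: put each article into the first cluster whose
    // first member is similar enough, otherwise start a new cluster.
    static List<List<String>> cluster(List<String> articles, double threshold) {
        List<List<String>> clusters = new ArrayList<>();
        for (String article : articles) {
            boolean placed = false;
            for (List<String> c : clusters) {
                if (jaccard(words(c.get(0)), words(article)) >= threshold) {
                    c.add(article);
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                List<String> fresh = new ArrayList<>();
                fresh.add(article);
                clusters.add(fresh);
            }
        }
        return clusters;
    }
}
```

Feed this two ice-skating headlines and one finance headline and it produces two clusters, even though nothing in the code knows what "ice skating" is -- the segments are derived from observation, not decreed in advance.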

The second reason is that segmentation changes.  While a given partitioning criterion may remain somewhat stable over shorter periods of time,  there may arise new partition criteria that are suddenly more important.  For instance, Internet access used to be rare and relevant to a small number of people.  In the west today, not having Internet access is more rare than having it used to be.  This happened relatively slowly over a decade or so, yet the advertising business and the content industry failed to catch on early enough and as a result, were late in upping their game.  There was no segment for "Internet users".  Then there was one but it was small. Then it got big.  Then it disappeared because everyone was an Internet user and those who were not...well, marketers usually don't care about cave-dwelling hippies off the grid, now do they.

To sum it up: if segmentation is to be useful it must be based on scientifically sound analysis of reality and one has to understand that it is a dynamic phenomenon and that the rate of change is usually accelerating.   Also it is important to understand that both quantitative and qualitative changes in partitioning occur. (Segmentation is, when we get down to it, a mathematical concept that has been sufficiently dumbed down to be taught in business schools.  But don't tell anyone I said that).

Which brings me to why I willfully mislead people by saying "there are no segments":  if saying so leads to people focusing on the product they want to build rather than on flimsy data and naïve analysis of same, that outcome is preferable.

Besides: build a sufficiently brilliant product and it will cause observable change in segmentation.

Product first.

The last statement in the previous section also sums up why I don't think you should pay too much attention to segments.  It limits you in how you think about products and it does so in big and important ways.  Segmentation can be a tool for tweaking existing products, to eke out small incremental gains, but it isn't a tool you can use to innovate.  It isn't like they ran the numbers and figured out in the 1920s that "our market research says there should be a market for watches worn on our wrists".  Also there's this old chestnut:
"If I’d asked people what they wanted, they would have asked for a better horse"
(attributed to Henry Ford)
If you pre-constrain a product to fit within a given mold you may miss big opportunities.  You may constrain your creative forces to focus their energy on filling a spec rather than going back and asking fundamental questions. (This is probably why I have yet to find a garlic press that actually works:  they are all just bad copies of each other and the fundamental problem has yet to be solved).

The reason Apple revolutionized mobile phones is because they asked fundamental questions.  Not because they accepted the mold into which the incumbents hammered their products.

In retrospect most people will giggle when you show them the almost perverse degree of segmentation some mobile handset manufacturers wasted their energy on.  But just 5 years ago, the same people would probably have thought quietly to themselves "wow, they've really covered all the bases here" and been impressed at the diligence with which manufacturers managed to fill every niche and tweak every last bit out of their product.

There are more people thinking about products in more radical ways and with the means to actually act on their ideas at relatively low cost than before.  It would be naive to think that this has no impact on how we model and predict.  The consumer is exposed to more new ideas every year than ever before.

So when I say I don't believe in segmentation, it is because it evokes thoughts of slow, ill-informed models that become irrelevant so quickly that they have limited predictive power.  But more importantly, I think they are not a good tool for ensuring nimble strategy.

When quantitative factors go all hockey-stick, it leads to qualitative changes.

(I would have liked to say something about segment size vs segment importance as well, but this blog entry is already too long and rambling so I'll save it for some other time).

Speak, friend, and enter

Ubiquitous and universal single-sign on is never going to happen.  Everyone wants to own the user, or at the very least not relinquish control of the user to possible competitors if they can help it.  This just seems to be a fact of life;  whether informed by reason or fear.   A bit of both I suppose.

This of course means that we need to keep track of gazillions of passwords.  Which is a problem.  A balancing act between convenience and security.  Every so often you are faced with another registration screen that requires you to type in a password.  And what on earth should you type in?

You may have a system for choosing passwords -- in itself a security risk, but still a very common way to cope with the massive number of accounts you have without having to write anything down.  The risk being that someone will figure out your scheme.  The gamble is that a) the scheme won't be self-evident by cursory study of one or a few samples, b) you are not that interesting so why would anyone invest time in cracking your password scheme.

So, you are looking at a form and you need to choose a password.  You type in something and the validation scheme of the site says you can't use that as a password.  This is annoying.  Because it means you have to come up with something that might be harder to remember.  Perhaps you'll even have to write it down to ensure you remember it.

I particularly dislike password validation schemes that require you to enter mixed case characters and digits with a minimum length of N characters.  I have a theory that this does not enlarge the search space for possible passwords:  it is going to severely limit the space.  Why?  Because we're human and it is very likely that we are going to pick a memorable password string that has these properties.

Ask yourself:  what information has mixed case and a number and will be easy to remember?  What are the first 5 password schemes you can think of that use information you would remember and that has these properties?  Next, how much of this information exists in some form in the public space?

I believe the correct response to this question contains at least one expletive.
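A back-of-the-envelope calculation shows why.  The numbers below are assumptions (the dictionary size especially -- roughly the vocabulary people pick memorable words from); the point is the ratio between the full search space and the space people actually use:

```java
// Rough arithmetic on how a "mixed case + digit, minimum 8 characters"
// rule interacts with human habits.  The dictionary size is an assumption;
// the point is the ratio, not the exact numbers.
class SearchSpace {
    // Full space of 8-character passwords over [a-zA-Z0-9].
    static double fullSpace() {
        return Math.pow(62, 8); // about 2.2e14
    }

    // What people actually type to satisfy the rule: a capitalized
    // dictionary word with one or two digits tacked on the end.
    static double humanSpace(double dictionaryWords) {
        return dictionaryWords * (10 + 100); // 1-digit and 2-digit suffixes
    }
}
```

With a 50,000-word dictionary the "human" space is about 5.5 million candidates -- some seven orders of magnitude smaller than the full space, and well within brute-force range for commodity hardware.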

Unlike in the movies, where the hacker manually types in passwords until he or she succeeds, in the real world you have web crawlers, you have frequency dictionaries, you have oodles of neat software to look for patterns and you have ways of automatically assembling a dossier on your target.  A dossier that can be used to generate password candidates.  Password candidates that can be used to mount a brute force attack.
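The dossier-to-candidates step described above can be sketched as follows.  This is a deliberately tiny toy -- real password-mangling tools apply far richer rules -- and every name in it is made up for illustration:

```java
import java.util.*;

// Toy candidate generator: take a few facts gathered about the target and
// emit the obvious "capitalize and append digits" combinations that
// satisfy typical validation rules.  Purely illustrative.
class Candidates {
    static List<String> generate(List<String> facts, List<String> numbers) {
        List<String> out = new ArrayList<>();
        for (String fact : facts) {
            String cap = fact.substring(0, 1).toUpperCase() + fact.substring(1);
            for (String n : numbers) {
                out.add(cap + n);        // e.g. a pet name plus a year
                out.add(cap + "_" + n);  // common separator variant
            }
        }
        return out;
    }
}
```

Feed it a pet's name and a birth year scraped from a social profile, and the resulting list almost certainly contains strings that sail through a "mixed case plus digit" validation check.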

(Okay, so sometimes it happens like in the movies.  Sarah Palin's mail account was broken into because her security question asked for information that was readily available on her Wikipedia page.  This is why security questions for enabling password reset are a Really Bad Idea.)

We have to assume that the people who write validation code for websites are not going to be experts in information theory, psychology or cryptography.  This is why I wish people who actually know something about this subject would put together a sort of manual of sound practices in designing password validation and helping people choose sensible passwords.

I know nobody wants to stick their neck out for fear of criticism.  Especially not people with computer security backgrounds since they have the most to lose if they were to write something that just makes things worse (which is a very real possibility).  But I would wish that there at least existed some design guidelines or a brief discussion that programmers could have a look at before adding counterproductive password validation schemes to their websites.   A go-to resource that "everyone" knows about.

Before we part I thought I'd point out that possibly the biggest security risk is right in front of you.  You're looking at it.  Your browser.  It most likely has a cache of a considerable number of passwords that you use to access your most important web sites.

How well guarded do you think that password database is?


Failed library design

In the past week I've spent a considerable amount of time being angry and frustrated with what I view as inept API and library design.

The point of having libraries is to make a given problem domain more accessible, by creating meaningful abstractions that insulate the programmer from some of the underlying complexity, to help the programmer produce correct code and to reduce the workload on the programmer.  

It is important to note that the last point is more about time and effort than the number of lines of code necessary.  It isn't about the amount of typing that has to be done.  This also extends to the documentation and whether it is necessary to spend days reading up on something just to use parts of a library.  If you have to spend a considerable amount of time understanding the design and implementation of a library just to use it in a simple context, the library designer has failed.

Failed library design is counterproductive and uncool.

In the past couple of weeks I have tried to make use of various SAML-related libraries and I have to say that I am thoroughly unimpressed.  In fact, I am angry.  I haven't seen such massive amounts of inept library design and sloppy programming since the last time I had to deal with ... well, certain XML technologies.

It is for good reason I become wary when people tell me they have written infrastructure code that deals with XML because it would seem that this particular niche in programming attracts people who simply are no good at what they do.  Before hiring people with extensive XML-experience on their CV I need to see two things:  I need to see code they have written which they think represents their best effort, and I need to talk to them to understand if they are prone to pointlessly complex library design.

If what I see is along the lines of what you can often find in certain XML libraries, then they can't expect to be hired.  I'm sorry, but I firmly believe that the job of a programmer is to shun complexity and create simplicity wherever possible and I have no desire to work with people who are either too lazy or too dumb to at least aspire to usefulness.

An example.

So let's have a look at just a single line of code (admittedly with a line break inserted by me to make it fit into the blog) from the sort of library I despise.  This is the recommended, idiomatic way you are supposed to create an Assertion.  Or rather, create a builder that can eventually be used to create an assertion:
SAMLObjectBuilder<Assertion> builder
= (SAMLObjectBuilder) builderFactory.getBuilder(Assertion.DEFAULT_ELEMENT_NAME);
The above code tells you a lot about what's in store for you should you be so foolish as to waste time trying to use the OpenSAML library.

So, the programmer wishes to create an Assertion.

In order to do this he first has to get hold of a builderFactory (not shown in the above code).  This in itself is sloppy.  Why do we need a builder factory?  Under what circumstances would an ordinary user even need to be aware that there are multiple sets of builders for any given type?  This is plumbing sticking out of the walls for no good reason.

Next the programmer has to get a builder for the assertion.  The first thing I find interesting is that the designer fails to abstract away the underlying complexity.  The programmer has to know the element name of the Assertion.  Why?  Isn't the point of having a library to help the user deal with the problem domain in a more abstract manner?   The fact that the Assertion class defines a handy constant for this does not excuse this.  This is a level of detail that the user should not have to worry about.  Besides, you will note that the constant holding the DEFAULT_ELEMENT_NAME has "default" in its name so it isn't like the designer didn't have an intuitive sense that there was an expected default.  A good designer would have thought about this and concluded that if the default element name is what will be used in 99% of the circumstances, then the programmer should not have to type this in.  Ever.

The second interesting thing is this whole SAMLObjectBuilder affair.  To get one you have to hand-crank the creation process with a cast and then assign the result to a genericized SAMLObjectBuilder type.  Why?  Would it not have been better to just have an Assertion.Builder type that you simply instantiate and then use?  Why would you first want to have a factory, and then design it so badly that you have to cast the results it produces?  In fact there is an AssertionBuilder that extends an AbstractSAMLObjectBuilder that extends an AbstractXMLObjectBuilder which is defined god knows where, but instantiating it directly isn't the intended way to do it, and if you start to mess with those types it is going to take you a while, and a few screenfuls of various Java source files, to make any sense of the whole giant mess.  There's a lot of structure for structure's sake.
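For contrast, here is a sketch of the kind of API argued for above.  To be clear: this is NOT OpenSAML's actual API and the class and its fields are hypothetical; it just illustrates a nested builder that needs no factory lookup, no element names and no casting:

```java
// Hypothetical sketch (not OpenSAML) of a nested builder: newBuilder() is
// the only entry point, setters chain, and build() validates and produces
// an immutable object.
class SamlAssertion {
    private final String issuer;
    private final String subject;

    private SamlAssertion(Builder b) {
        this.issuer = b.issuer;
        this.subject = b.subject;
    }

    String getIssuer() { return issuer; }
    String getSubject() { return subject; }

    static Builder newBuilder() { return new Builder(); }

    static final class Builder {
        private String issuer;
        private String subject;

        Builder setIssuer(String issuer) { this.issuer = issuer; return this; }
        Builder setSubject(String subject) { this.subject = subject; return this; }

        // Validation lives in one place, so a built object is always sane.
        SamlAssertion build() {
            if (issuer == null) throw new IllegalStateException("issuer is required");
            return new SamlAssertion(this);
        }
    }
}
```

Object construction then becomes a single chained expression with no cast in sight: SamlAssertion.newBuilder().setIssuer("https://idp.example.org").setSubject("alice").build().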

Also, why would you call something a builder when it is in fact a factory?  While some literature may be overly vague about what a builder is, good practice is to use builders as instantiation helpers that ensure the produced object is in the desired state.  Often producing an immutable object -- sometimes to insulate the programmer from arduous initialization.  Assertion has both getters and setters, and the builder, as near as I can tell, is just used for object creation via the inane buildObject() method.  In other words, the whole builder thing is just another bit of plumbing sticking out of the walls.

If you want a good example of builders you can have a look at the API of the code generated by the Google Protobuf compiler.  Here's a typical example of what object construction looks like when you use protobuffers:
Timber.LogEvent logEvent = Timber.LogEvent.newBuilder()
    .setTimestamp(timestamp)
    .addPayload(payloadBuilder)
    .build();

In the above code snippet (the setter names are illustrative) we create a log event builder, set some values, add a nested payload and then construct the needed object.  Understanding how to use this is dead simple.  All generated types have a newBuilder() that produces a correct builder without any silly fluff.  All builders have setters for their values and adders for nested collections.  Setters are chainable.  The build() method validates and produces a finished instance.  Done.  All this takes you about 5 minutes to learn and is so simple you won't have to read the source or the javadoc again for months or years.

It should be noted that we have still only investigated a SINGLE line of idiomatic OpenSAML use, and already we have trodden in a miserable swamp of shoddy design and had to tour the source for answers (which is like peeling an onion:  layer upon layer with little or no substance in any one layer and a lot of crying) -- even before actually looking at how you would accomplish anything of use with this library.

Believe me, it gets a lot worse when you do.

You may ask why I am singling out OpenSAML for criticism.  Well, why not?  I had to pick an example and it might as well be OpenSAML because I have been trying to use it lately -- and have come to the conclusion that it is actually easier, faster and safer to just read the SAML specs and write things from scratch.  It is a pain in the ass, but considerably less risky than trusting heaps of code too messy to read.  Granted, the SAML specs are a dull read and I still have to deal with heaps of terribly designed XML infrastructure, but it beats wasting even more time on an unhelpful library that fails to hide the details anyway.

Besides, it will mean that I do not force those who come after me to take on the design debt of OpenSAML as well.

I am also using OpenSAML to make a point.  It must be okay to point out bad design.  It may not be nice, but then again, wasting my time is also not nice. If we avoid naming and shaming bad design, people will not learn.

Now, if you have worked on OpenSAML I would imagine that you are feeling defensive right now.  Perhaps a bit angry.  You may think that I am unfair because you just did what everyone else did when designing and writing the code. (The idioms found in OpenSAML sure are plentiful in other XML code, though that doesn't make them "good").

If you feel like responding (please don't.  I am not interested in hearing a defense for OpenSAML.  There really is no point in even trying to defend it): before you respond, I think you should sit down and ask yourself if you are really sure that OpenSAML is helpful to anyone but those who are either a) prepared to spend a lot of time understanding the library, or b) utterly comfortable with cutting and pasting code they barely understand to put together something that apparently works.

Then you should ask yourself if what OpenSAML does is really so complicated that the complex, unfriendly, messy API is warranted. (I can give you a hint:  it isn't).

If you think that people should have to spend a lot of time understanding OpenSAML you are wrong and you should probably never write library code ever again.