2010-02-28

Nostalgia

I fired up a virtual machine running Windows tonight.  I wanted to have a look at some Windows-only CNC software, and since I don't have any machines that run Windows anymore, the way to do it was to create a Windows image and run it in a virtual machine.

Years ago I used Windows for running Photoshop and various music software.   Commercial stuff for which there is no good open source substitute.

Getting a Windows machine to run low-latency audio software reliably is about as easy as convincing Richard Stallman he is being an annoying turd when he tactlessly, and rather ignorantly, insists on calling Linux "GNU/Linux".  So I ditched Windows in favor of OSX for running commercial software (and discovered a wonderful world of reasonably priced, non-tacky software in the process -- in addition to being able to run all my open source stuff on the same OS).  Eventually I grew fond enough of OSX to use it as my main workstation OS.  I still have various machines running Linux and FreeBSD, but on the desktop I now run OSX.

In any case, tonight I fired up Windows.

Two things struck me:
  1. Windows is really annoying.  How can people use this OS every day for their work?  It is like an attention-seeking little kid that keeps tugging at your trouser leg and throwing temper tantrums ALL THE FRIGGIN TIME if you try to ignore it!
  2. I had forgotten about what a terrible program Acrobat Reader is.  Holy crap.  How is it even possible for Adobe to so actively sabotage arguably the most important product they have ever had?  Adobe, if you are reading this:  have a look at "Preview" on OSX.  No really, do have a look.
All these memories of how painful Windows was kept coming back to me -- but do you know what was great?  When you run Windows in a virtual machine you don't need to shut it down properly.  You can just kill it, tell the VM not to commit the changes, and start anew every time.  That way you can avoid the terrible mess that any Windows system eventually turns into.

Fragmentation

Years ago someone tried to convince me that the USENET would be replaced by blogs, RSS feeds and comments -- because this would be so much better.  While I didn't doubt that the USENET, as the central arena for public discourse on the net, would be greatly diminished and eventually disappear as this newfangled interweb grew in popularity, I never subscribed to the idea that things would become significantly better just by using blogs and RSS.

I am sad to say that I was right.

In order to keep abreast of all the various discussions and conversations I am having, I have to visit more than a dozen sites.  And those are just the ones I care about right now.  There are a few dozen more sites that I have forgotten about, where I will never return to read what people have since contributed to some discussion, nor ever respond to them.  It isn't because I don't care.  It is just that it is turning into a bloody easter-egg hunt.

There are lots of sites that try to neatly tie together the fragmented conversations you are having.  But none of them has succeeded to the degree where the problem is even close to being "solved".  Blogs, Twitter, PhpBB, Facebook, Google Groups etc.; it is just one big fragmented mess, with lots of partial aggregation systems thrown in just to add to the confusion and fragmentation.


Don't get me wrong; the USENET was not perfect -- in fact I'd have to say that the USENET has been completely irrelevant for many years now because of its many shortcomings.  But it did solve some problems that a lot of people have yet to re-invent solutions for.

And that is what bugs me: why do we keep reinventing the wheel and then forget about problems that people at least tried to address 25-30 years ago?

From a technical point of view, Google Wave is probably one of the more promising technologies I've seen.  The problem is that most people do not seem to understand what makes Google Wave a good solution.  Heck, it even took me a few hours of reading up on the underlying technology to see it, and I care about these things.  I kept telling my techie friends to read the specs before dismissing Google Wave.  I told them that it takes a bit of time to get it; the underlying ideas are sound.  The only problem is that last I checked, the client doesn't really work -- and to the user, the client is Google Wave.

In any case, if you miss my participation in some forum where I've posted something and you have replied, it isn't that I'm ignoring you.  It is just that it is such a pain in the ass to follow every place where I've posted something because we've collectively reverted to really, really dumb communication technologies.

2010-02-27

Hands off my search engines!

There have been rumblings about the EU looking into allegations of lack of fairness in the Google search results.  Apparently some web sites that offer their own sort of search services feel they have not been displayed as prominently as they would wish in the search results.

Let me explain what Google does:  Google lives and dies by the perceived quality of their search results.  If Google serves up results that are not helpful to their users, that hurts user satisfaction.  Everything Google does on search is measured against what provides the greatest user experience: "can people find what they are looking for?".

Search is a hard problem.  You have billions of web pages, the user enters a few words and in essence, the search engine has to come up with the 10 pages most relevant to those few bits of "intent" that the user provided.  Sure, a given search will usually have lots of hits, but we count on search engines to place what we are looking for at, or near, the top of the result set. 

If you think that search engines have not been innovating just because the interface you see has changed only slightly over the years, you should consider this:  the size of the web has been doubling every so many months.  Also, since the top spot in a search result can have very direct and easily measurable financial consequences, ever more resources are devoted to gaming the search engine ranking algorithms.

In effect this means that a lot of people are working to place their content in front of you -- whether relevant for what you are looking for or not.  These people are working against you, the user, and the search engines.

For some searches I do, search engines unfortunately give me pretty bad results.  For instance, if I look for a web shop that will sell me some gadget and the top 10 results are dominated by price comparison sites or other aggregators of content, that pretty much sucks.  It isn't what I am looking for.  What I am looking for is, preferably, the best place to buy the gadget or background information that will help me make a decision.  If Google observes that I am not too happy about those search results and then decides to change the ranking to give me results I do want, then that is the right thing to do for them and for me the user.

It is that easy.

I find the idea that we would allow government bureaucrats to dictate how search engines rank their results to be a horrible idea.  First and foremost because it is dangerous to let governments dictate what is "truth".

Second, because these people are so obviously not qualified -- they do not even understand the basics of the problems a search engine must solve.  It is dangerous to even pretend that these people are worth listening to.

Third, the interests of Google and the interests of the user are more closely aligned than those of spammers and bureaucrats.  I can't help but feel that spamming is more honorable than the blind application of non-knowledge to the detriment of society by bureaucrats whose main ambition is to stay employed -- and preferably to retain a considerable expense account to be used liberally in a civilized country with good beer.

One can only hope that this process leads to the education of government bureaucrats, though I am not holding my breath.

(disclosure: I've worked for a number of search engine companies, including Google, but I do not work in the search engine industry now.  I have no incentive to side with Google, Bing or any other search engine.  I do however have justified reasons to fear stupidity in government.)

2010-02-26

Thoughts on interfacing with EMC2

The original plan for my CNC machine was to use an Arduino to take care of the timing-sensitive task of delivering pulses to the stepper drivers.  A few things have happened to that plan. 

First off I discovered that the power rating of the EasyDriver 4.2 boards is probably going to be a bit too low. The EasyDriver can deliver 750mA to each phase of the motor.  This was probably enough for the smaller motors I got from Sparkfun, but the motors that are mounted on my CNC rig draw significantly more current than that so I needed somewhat beefier boards.  I can't remember where I got the beefier boards, but they are in the mail by now and can deliver 2A per phase.

Second, I have been skimming the docs for EMC2, and there does not appear to be any easy route to getting the software talking to an Arduino that just takes gcode as input.  The easiest way forward seems to be using the parallel port to deliver pulses generated by EMC2 to the stepper drivers.  The good thing about this is that it means I can benefit from the rich functionality in EMC2.  The bad thing is that using the parallel port seems to be fraught with its own set of challenges, from the fact that it makes the system very timing-sensitive to stories about ratty parallel port interfaces that produce jitter or just fire off spurious pulses when idle.

I think I read something last night about someone using an Arduino to create a lower level protocol and much simpler software aboard the Arduino -- which interfaces with EMC2.  I would feel more comfortable with having the Arduino interfacing to the stepper drivers.  The parallel ports sound dodgy.  I am hoping to find more information on this.

Also, I need to build a PSU and figure out how I want to control the spindle.  For now I will probably have manual speed control, but eventually I would like to be able to control spindle speed from software.

2010-02-22

Trash talk.

The press likes a good fight and when Steve Jobs talks trash about Adobe and Flash, the press is more than willing to print it.  Of course, there's also the fact that Jobs is right about Flash.

Flash is an embarrassing piece of software that doesn't work for any reasonable definition of the word "works".  For what it does it consumes a staggering amount of resources, and it doesn't seem to get any better with each new version.  Of course, users hardly expect Adobe products to really get better over time.  Seriously, they don't -- and it is starting to show up in the sales numbers.  The products just grow more features you don't need, and take up more space.

Which is how Adobe squandered their most important product ever:  Adobe Acrobat Reader.  Did you know that the download page for Acrobat Reader used to be the web page with the highest static rank on the web?   You would think that would inspire Adobe to make an effort -- to understand their customers, to make sure their product was the best measured by the sort of metrics the customer cares about.

It is rather telling that I read many PDF files during a normal day, yet it has been several years since I used Acrobat Reader to read them.  I just don't need the hassle of Acrobat Reader.

Sadly, Adobe seems to be managed by people who should have retired more than a decade ago.  No really, I mean it, they should get a new management team.  They can't possibly be any good at what they do.  Adobe has several dominant, sector-defining products and the demand for the sort of products they make is increasing.  Business should be great -- not "meh".

But back to Steve Jobs.  I think that before he bashes Adobe he should clean up his own back yard first.  And he can start by cleaning up the iTunes application.  What the hell is wrong with the people who do the iTunes app? 

The thing produces the spinning ball of fail all of the time.  Whenever it tries to do more than one thing at a time, the application becomes unresponsive for long periods of time.  Even when it only tries to do one thing it often becomes unresponsive.  Very, very often.  As in "several times every time I try to do something".  Get it?

Steve Jobs: your iTunes app behaves like a first year student's first exercise in concurrent programming.  No, really it does and I am being nice to you.  It is a showcase of concurrency fail.  There is no reason why the GUI thread should be blocked because iTunes is downloading a file or talking to a connected iPod.  None.  Zero.  iTunes doesn't need to lock up because IO takes place.  I have a Mac that is more powerful than most supercomputers of not too long ago -- it does NOT need to mull things over for a good 10 seconds while doing IO.  

Now that the iTunes app is becoming such an important conduit for business at your company, you should take it a bit more seriously than you have up until now.  You do not want to end up like Adobe Acrobat, right?

Less bitching about Flash.  More work on fixing the most obvious problems in iTunes.

2010-02-21

CNC Build

Tonight Ståle and I set out to assemble the CNC machine.  Before starting we had a large pasta meal with all the trimmings.  Then we cleared the table and laid out all the parts.  From time to time I grabbed my camera and took some pictures, so the guy you see in the pictures is Ståle.


We started by assembling the Y-axis, which controls the back and forth movement of the table.  We had to somehow fit all the bearings into some very tight holes, and unfortunately I do not have a hydraulic press in my apartment, so we had to solve it with a bit of ...uhm... creativity.


Below you can see most of the main parts of the Y-axis being assembled.  The good thing about building something from scratch is that it gives you a chance to understand the details of the machine.  For instance it becomes relatively obvious why the anti-backlash nuts are needed.  (From US patent number 4210033: "An anti-backlash nut is disclosed of the type which undergoes translational movement along a screw in response to relative rotational movement between the nut and screw. This anti-backlash nut has oppositely-directed longitudinal flexure members radially biased towards the screw to provide equal drag torque and minimum relative deflection for travel of the nut in either direction.")

 

With the Y-axis assembly done all that was left was bolting on the table.

  

  

Next up was the X-axis.

  

X-axis shaping up nicely.

  

Back-plate of the X-axis in place as well as the front plate for mounting the Z-axis.  Ståle can be seen in the background.

 

Gantry placed on top of the base just to check that everything fits.  Next up was to build the Z-axis.

 

The Z-axis is just the same as the X- and Y-axis, only smaller.

  

When mounting the Z-axis I somehow managed to put the thing on upside-down, with the motor mounts pointing downward.  We quickly rectified the problem, and here you can see all axes mounted correctly, with the Z-axis motor mounts facing upward.  (Funny, we didn't drink that many beers...).

 

Finally, the whole thing with stepper motors in place and the gantry bolted onto the base.


Total build time was something like 4 hours, a couple of cans of beer and a can of Red Bull.  Some time was lost due to an Ocicat that needed attention every 10 minutes.

The one thing that is a bit worrying is that the lead screws need a bit more force to turn than I had anticipated, but I see other people worrying about this as well, so I guess it is normal.  Since the X-axis seems to be the axis that needs more torque to be turned I guess we will wire that up first and see if the stepper motor has any problems.

The stepper motors that are fitted to the CNC machine are a bit larger than those I got from Sparkfun.  I guess if these prove to be too weak I'll find some more powerful steppers.

Next up is to hook up the steppers to the stepper drivers and do some simple test runs.  Mostly just jog the steppers along each axis and grease up the lead screws and the rods so everything runs smoothly.  I guess I will have to hack up some software for the Arduino and possibly create some sort of test interface in Processing.

2010-02-14

EasyDriver 4.2 and Stepper

EasyDriver v4.2 hooked up to a stepper-motor.  I have another 2 miniature breadboards to mount the other EasyDriver cards for experimentation, but judging by what I've read they tend to run a bit hot, so when mounting them in a more permanent fashion I need to think about cooling.  Depending on how hot they get I might get away with some cooling fins and a couple of fans,  but I've seen people design elaborate aluminium rails for mounting up to 5 EasyDriver boards and use the rail for dissipating the heat.

I didn't get around to hacking the Arduino this weekend since I had to do a bit of work.  (Besides I spent saturday reading up on, and experimenting with, Grails since that was a more "couch compatible" activity.)

While the goal is still to figure out if I can use the Arduino Mega to implement a gcode interface, I suspect that once the CNC rig is built I might get a bit impatient and use one of the parallel port hacks to hook it up to a machine.  I still know too little about the Enhanced Machine Controller (EMC) package, so I should probably read up a bit on that and install it on a spare computer I have lying around.

I also need to understand the Atmel microcontroller a bit better.  I have this vague idea of a scheduling-based platform for controlling the steppers, reading serial input and possibly controlling a small status display.  To do that I probably need to experiment quite a bit with timing and see if it is practical to write interrupt handlers.  I could really use some advice from more seasoned Atmel programmers, so if any of you find these things interesting, please do not hesitate to contact me.

There are a lot of people who want a "headless" setup for their CNC machines -- meaning they would like to dump gcode to an SD card and be able to run the gcode without having a computer hooked up.  Possibly with a small filesystem browser on a 2-line LCD so you can select which gcode file to run.  This is not a big priority for me right now, but a cool idea.

2010-02-13

Snack

- "bring me some snacks", she said
- "well, how about carrots and dip?"
- "you know I can't have carrots, you dork" (she has braces)

[slight pause]

- "how about I grate them and mix them in the dip for you?"

Hey, it was worth a shot.

2010-02-12

This 'Big' Is Bigger Than That 'Big'

Most people who have known me for any length of time know that I have a particular fondness for IM systems.  Unlike most, I don't think about actual people chatting when I think of IM systems -- I think of it as an infrastructure which provides a set of communication primitives.


IM seen as communication primitives.

Primitives that can conveniently be used for person-to-person communication, but which can also be leveraged by software agents that need to be able to communicate asynchronously,  from behind NAT'ed connections (or non-routable mobile IPs) and without knowing or caring where the other party is.  Having a concept of agent state, however primitive, is also a useful feature.  As are various forms of group communication primitives.

If you are the curious type you have probably set up a packet sniffer to figure out what your Android telephone is doing when it boots up.  I have to say that I wasn't surprised when I saw that it connects to what appears to be an IM service at Google.  I'll leave it as an exercise to the reader to figure out what this connection is doing.


How big is big?

Earlier tonight I was looking for information on how scalable one can expect ejabberd to be.  I did some searches and skimmed through some articles, blog posts and mailing list threads.  This reminded me that people have wildly different views on what magnitude they assign to words like big and scalable.  

For a lot of uses an IM system can be said to be big if it can handle what I call an "enterprise size problem".   Enterprises typically have a few thousand users.  Larger enterprises may approach the 100k mark.  For those uses, being able to handle 10k simultaneous users on a single node is a Big Deal.

Sadly, for "internet scale problems" 10k simultaneous users isn't even a credible test setup for an IM system.   "Big" starts at 100M registered users and some double-digit percentage of those being online and chatty.  Internet scale problems are usually addressed by architectures that are distributed -- often by having lots of moderately powerful servers in a datacenter.

If you can't handle 100M users affordably, and you have no reason to suspect that your system is physically able to scale to 10x that with a not-too-far-off-linear increase in cost, you are doomed.  No point in even playing.  Go home.  Take up knitting or crochet instead.

My cursory examination of what has been said about ejabberd scalability suggests that it is probably fine for the enterprise, but it sounds too expensive to scale to Internet scale problems.  Worse still, some other XMPP implementations I read about perform even more poorly -- which is hardly a surprise when you take into account that there are still weirdos who dedicate an OS thread to handling each connection -- in 2010.  Not even IRC (which has scalability problems by the bucketload) has servers that are this inefficiently designed.

This means that if you are in the business of providing an IM-infrastructure (XMPP or otherwise), you are most likely going to have to roll your own.

The really big IM systems I've glanced at run on relatively large farms of servers.  There is generally more than one type of server node.  Some handle connections, some track state and handle the roster, some do message routing, some even do persistent queueing in case the client is currently offline, etc.  The architectures vary quite a bit, but the common theme is that it isn't as simple as just writing an IM server and firing up N instances of it to handle M users.  There is some division of responsibilities, since you are not solving one problem -- you are solving a set of very different problems.

Quite recently various notes on how Facebook put together an IM system have surfaced on the net.  (Perhaps notable for their polyglot approach to the problem and for introducing Erlang into the vocabulary of people who follow these things semi-consciously).  The Facebook system sounds interesting and I am impressed at how quickly they were able to deploy it.

One last rant before I go to bed:

For XMPP the protocol is an issue.  Despite what the XML camp claims, XML is slow and clumsy if you need to parse a lot of it and you have to manage cost.  People who claim otherwise usually fall into one of two camps: people who have never implemented an XML parser and people who have only ever implemented an XML parser and whose basis for comparison is other XML parsers or other inefficient forms of serialization.

I'm sorry, but if you think XML is a performant technology I'm going to have to suspect you of not being a very bright person.  Please don't bother commenting.  I am not interested.

My main beef with XMPP is that it uses XML for framing:  that is for delineating the boundaries of a single message.  This was a big mistake and I was told (by people who were close to the standards process) that it was on the verge of being rectified, but never made it into the standard for various reasons.

In practical terms this means that a performant implementation of an XMPP connection server needs two XML parser implementations -- one that merely figures out the framing and one that actually parses the payload -- or some variation on this theme (you could do some shallow parsing in the "framing parser" as well, but you get the basic idea).

The reason is that you really do not want to have a big, stateful, fully featured XML parser sucking on the end of the network socket that the user is connected to.  If at all, you want to feed the message to the big hairy all-bells-and-whistles-standards-compliant XML parser only once you know you have a complete message.

This is further complicated by the fact that in languages that do not provide a good threading abstraction -- one that avoids mapping threads of execution to OS threads -- you usually do connection multiplexing.  (Which is one of the reasons Erlang is an attractive choice for a connection server.)  Even when you do have a good threading abstraction, you still care about the per-connection cost.  Having 10,000 connections, each with a big leaky XML parser allocated to it ... well, you see the problem.

Internally, large IM systems don't use XML.  They usually employ a more efficient serialization (and few inter-node connections).  XMPP in full XML is mostly useful for integrating with user-agents or for federation -- not between nodes in a large IM system.

Of course, there is nothing preventing you from offering more compact and efficient protocols to clients,  but this is perhaps something you would want to do only in more specialized cases.  (For instance when communicating over mobile broadband ;-)).

The important part of XMPP isn't the client-server protocol.  It is the set of primitives offered.

Oh, and if there was any doubt:  if you are building a large IM system then I think it should be based on XMPP -- which basically means you have to support the primitives of XMPP in your architecture and you have to support the standardized protocols when talking to the outside world.

2010-02-11

Apache Axis of Evil

I passionately hate Apache Axis. 

Today's fun and games center around the fact that the WSDL-to-Java code generator seems to output different code on the same machine depending on whether it runs from my home directory or as part of the Hudson build.  During the Hudson build this gets generated:
public class ServiceException  implements java.io.Serializable
whereas when I run it from my home directory (or on my laptop, or my other workstation at home) it produces:
public class ServiceException  extends org.apache.axis.AxisFault  implements java.io.Serializable
And this is for the exact same J2SE version, the exact same version of all dependencies etc.  What on earth is going on!?

2010-02-10

One small step for man.


I am currently building a CNC mill -- one small step at a time. A lot of people ask me what I need a CNC mill for: I don't, but I thought it might be a fun project to learn about controlling machinery, programming microcontrollers and learning a bit of electronics. If I get a working CNC mill out of it: great! If I am able to build something that works with any reasonable precision it means I'll be able to mill stuff that I can draw -- which might come in handy for my other hobbies.

It also struck me that if I replace the milling head with what is in practice a glorified glue gun I can probably find some way to turn the thing into a 3D printer, but we'll investigate that opportunity if and when I get a CNC machine working properly.

Yesterday I finally hooked up my EasyDriver v4 stepper motor driver to my Arduino Mega. I soldered riser pins to the connections on the EasyDriver so I can mount it on a breadboard and access all the pins that are broken out. I used the pictures from Daniel Thompson's tutorial on hooking up an Arduino to an older version of the EasyDriver plus some other tutorials as a guide.

After writing a short test program to generate pulses for the stepper and set the direction pin, I got the motor moving -- before I had even hooked it up to the 12 volt regulated power supply from which the stepper should be driven. Hmm, that's odd. Either I have done something wrong or I didn't understand how the EasyDriver works. It can't be good if the stepper is being driven from the Arduino's power.

In any case, from here on I need to be a bit more structured and keep proper notes of what I am doing. I should probably use Fritzing to document how things are hooked up so I can get some input on what I am doing. After all, I am a software person and I really know very little about electronics.

I did some experiments with the stepper to see how it behaved. From Thompson's tutorial I gathered that if you pulse the motor more frequently than every 200 microseconds it will stall -- which I verified. I just halved the period and it did indeed stall. The code to perform nSteps steps looks something like this:

for (int i = 0; i < nSteps; i++) {
  digitalWrite(stepPinX, LOW);
  digitalWrite(stepPinX, HIGH);  // the rising edge triggers one step
  delayMicroseconds(200);        // ~5 kHz step rate
}

It can't possibly be that simple :-).

I am not entirely convinced that a constant pulse frequency is what you would want to feed the motor in all situations. Since we are talking about physical objects that have inertia I would assume that you need to account for acceleration and thus have a way to ramp up and ramp down the pulse frequency. I also suspect that the motor will be more prone to stall at low revs or from a standstill -- meaning I can probably pulse it faster than 5kHz once it is up to speed.

I hope that the books I have ordered will contain some answers.

My current assumption is that I should use the Arduino to implement some sort of high level interface (G-code?) reachable via a serial connection. I have seen people use the parallel port of their computers to pulse the stepper drivers, so I could of course go down that route. However, using a serial connection and offloading the work to the Arduino seems much more elegant.

My short term plan:
  • Implement an asynchronous serial line text protocol interface that can be used to export a command interface from the Arduino. I've seen a lot of examples where people have loops with delays in them for reading the serial input. If I am going to use the Arduino to control multiple steppers in real time I do not want to block while waiting for IO.
  • Figure out what I need to move development of the software to GCC and Emacs. The Arduino software package is nice for playing around, but I would rather use tools that I am more familiar with for more serious development.
  • Familiarize myself with the G-code protocols and figure out if there are open source implementations that I can study and possibly re-use.
  • I also suspect I have to implement some sort of scheduler so I can translate high level commands into precisely timed outputs, queue them up and have the scheduler execute them with correct timing. Since I have no prior experience with programming microcontrollers I guess this could get hairy. I probably need to build a test rig (using a second Arduino or a logic analyzer) to run tests.
  • Figure out what high level primitives I have to implement myself. For instance: does G-code imply that I need to be able to calculate and execute motion along arcs? How painful is it going to be to do this in a numerically sound way on the microcontroller?

Anyway. I have a lot of learning left to do. What great fun! :-).