Channel: Hacker News 50


The Magic of strace - Chad Fowler

Comments:"The Magic of strace - Chad Fowler"

URL:http://chadfowler.com/blog/2014/01/26/the-magic-of-strace/


Early in my career, a co-worker and I were flown from Memphis to Orlando to try to help end a multi-day outage of our company’s corporate Lotus Domino server. The team in Orlando had been fire-fighting for days and had gotten nowhere. I’m not sure why they thought my co-worker and I could help. We didn’t know anything at all about Lotus Domino. But it ran on UNIX and we were pretty good with UNIX. I guess they were desperate.

Lotus Domino was a closed-source black box of a “groupware” server. I can’t remember the exact problem, but it had something to do with files not being properly accessible in its database, and apparently even the escalated support from Lotus was unable to solve the problem.

This was one of the most profoundly educational experiences of my career. It’s when I learned what may be for me the most important UNIX tool to date: strace.*

Nowadays I probably use strace or some equivalent almost every day I work as a programmer and/or systems engineer. In this post I’ll explain why and how and hopefully show you some tips on how to get the most out of this powerful little tool.

What is strace?

strace is a command line tool which lets you trace the system calls and signals received by a given process and its children. You can either use it to start a program or you can attach to an existing program to trace the calls the program makes to the system. Let’s look at a quick example. Say we have a simple C program like this:

It doesn’t do much. It just prints “Hi” to the screen. After compiling the program, we can run it with strace to see all of the system calls the program makes:

To start the program, we invoke strace and pass the program name (and any parameters we want to pass to the program). In this case we also passed the “-s” flag to strace, telling it the maximum string size we want it to print. This is helpful for showing expanded function arguments. Here we pass 2000, which is arbitrarily “enough” to see everything we need to see in this program. The default is 32, which in my experience means we’ll almost certainly miss information we care about in the trace. We also use the “-f” flag, which tells strace to follow any children of the program we execute. In this case there are no children, but when using strace regularly it’s probably good to get into the habit of following a process’s children so you can see everything that happens when you execute the program.

After the invocation, we see a rather verbose log of every system call. To some (most?), this might look like gibberish. But, even if you have almost no idea what you’re looking at, you can quickly spot some useful pieces of information.

The first line of the trace shows a call to execve(). Unsurprisingly, execve()’s job is to run a program. It accepts the path to the program, any arguments as an array, and a list of environment variables to set for the program (which are omitted from the output since they’d be so noisy).

The last two lines contain another recognizable sequence. First you see a call to write() with our C program’s string “hi\n”. The first argument to write() is the file descriptor to write to. In this case it’s “1”, which is the process’s standard output stream. After the write call (which looks garbled because the actual write to standard out showed up along with the strace output), the program calls exit_group(). This system call acts like _exit() but terminates all threads in the process, not just the calling one.

What are all these calls between execve() and write()? They take care of loading the shared libraries the program depends on, such as the standard C library, before the main program starts. They basically set up the runtime. If you look at them carefully you’ll see that they walk through a series of possible files, check to see if they are accessible, open them, map them into memory, and close them.

An important hint: every one of these functions is documented and available to read about using man pages. If you don’t know what mmap() does, just type “man mmap” and find out.

Try running through every function in this strace output and learn what each line does. It’s easier than it looks!

Tracing a running, real-world process

Most of the time when I need a tool like strace, it’s because there’s a process that’s already running but not acting right. Under normal circumstances these processes can be extremely hard to troubleshoot since they tend to be started by the init system, so you can only use logs and other external indicators to inspect them.

strace makes it easy to look at the inner workings of an already running process. Let’s say we have a problem with Ruby unicorn processes crashing in production and we’re unable to see anything useful in the process’s logs. We can connect to the process using strace’s “-p” flag. Since a real production web service is likely to generate a lot of activity, we’ll also use the “-o” flag to have strace send its output to a log file:

After only a few seconds, this log file has about 9000 lines. Here’s a portion of it that contains some potentially interesting snippets:

We won’t go through every line, but let’s look at the first one. It’s a call to select(). Programs use select() to monitor file descriptors on a system for activity. According to the select() man page, its second argument is a list of file descriptors to watch for read activity. select() will block for a configurable period of time waiting for activity on the supplied file descriptors, after which it returns with the number of descriptors on which activity took place.

Ignoring the other parameters for now, we can see that this call to select() watches file descriptors 14 and 15 for read activity and (on the far right of the trace output) returns with the value “1”. strace adds extra info to some system calls, so we can see not only the raw result here but also which file descriptor had activity (14). Sometimes we just want the raw return value of a system call. To get this, you can pass an extra flag to strace: “-e raw=select”. This tells strace to show raw data for calls to select().

So, what are file descriptors 14 and 15? Without that info this isn’t very useful. Let’s use the lsof command to find out:

In the 4th column of lsof’s output, labeled “FD”, we can see the file descriptor numbers for the program. Aha! 14 and 15 are the TCP and UNIX socket ports that Unicorn is listening on. It makes sense, then, that the process would be using select() to wait for activity. And in this case we got some traffic on the UNIX socket (file descriptor 14).

Next, we see a series of calls which try to accept the incoming connection on the UNIX socket but return with EAGAIN. This is normal behavior in a multi-processing network server. The process goes back and waits for more incoming data and tries again.

Finally, a call to accept4() returns a file descriptor (13) with no error. It’s time to process a request! First the process checks info on the file descriptor using fstat(). The second argument to fstat() is a “stat” struct which the function fills with data. Here you can see its mode (S_IFSOCK) and the size (which shows 0 since this isn’t a regular file). After presumably seeing that all is as expected with the socket’s file descriptor, the process receives data from the socket using recvfrom().

Here’s where things get interesting

Like fstat(), recvfrom()’s first parameter is the file descriptor to receive data from and its second is a buffer to fill with that data. Here’s where things get really interesting when you’re trying to debug a problem: You can see the full HTTP request that has been sent to this web server process! Here it is, expanded for readability:

This can be extremely helpful when you’re trying to troubleshoot a process you don’t have much control over. The return value of the recvfrom() call indicates the number of bytes received by the call (167). Now it’s time to respond.

The process first uses ppoll to ask the system to tell it when it would be OK to write to this socket. ppoll() takes a list of file descriptors and events to poll for. In this case the process has asked to be notified when writing to the socket would not block (POLLOUT). After being notified, it writes the beginning of the HTTP response header using write().

Next we can see the Unicorn process’s internal routing behavior at work. It uses stat() to see if there is a physical file on the file system which it could serve at the requested address. stat() returns ENOENT, indicating no file at that path, so the process continues executing code. This is the typical behavior for static file-based caching on Rails systems. First look for a static file that would satisfy the request, then move on to executing code.

As a final glimpse into the workings of this particular process, we see a SQL query being sent to file descriptor 16. Reviewing our lsof output above, we can see that file descriptor 16 is a TCP connection to another host on that host’s postgresql port (this number to name mapping is configured in /etc/services in case you’re curious). The process uses sendto() to send the formatted SQL query to the postgresql server. The third argument is the message’s size and the fourth is a flag–MSG_NOSIGNAL in this case–which tells the operating system not to interrupt it with a SIGPIPE signal if there is a problem with the remote connection.

The process then uses the poll() function to wait for either read or error activity on the socket and, on receiving read activity, receives the postgresql server’s response using recvfrom().

We skipped over a few details here, but hopefully you can see how with a combination of strace, lsof, and system man pages it’s possible to understand the details of what a program is doing under the covers.

What’s “normal”?

Sometimes we just want to get an overview of what a process is doing. I first remember having this need when troubleshooting a broken off the shelf supply chain management application written in “Enterprise Java” in the late 90s. It worked for some time and then suddenly under some unknown circumstance at a specific time of day it would start refusing connections. We had no source code, and we suffered from the typical level of quality you get from an enterprise support contract (i.e. we solved all problems ourselves). So I decided to try to compare “normal” behavior with the behavior when things went wrong.

I did this by regularly sampling system call activity for the processes and then compared those normal samples with the activity when the process was in trouble. I don’t remember the exact outcome that time, but it’s a trick I’ve used ever since. Until recently I would always write scripts to run strace, capture the output, and parse it into an aggregate. Then I discovered that strace can do this for me.

Let’s take a look at a unicorn process again:

Here we use the “-c” flag to tell strace to count the system calls it observes. When running strace with the “-c” flag, we have to let it run for the desired amount of time and then interrupt the process (ctrl-c) to see the accumulated data. The output is pretty self-explanatory.

I’m currently writing at 7am about a system that is used during working hours, so the unicorn process I’m tracing is mostly inactive. If you read through the previous trace, you shouldn’t be surprised by the data. Unicorn spent most of its time in select(). We know now that it uses select() to wait for incoming connections. So, this process spent almost all of its time waiting for someone to ask it to do something. Makes sense.

We can also see that accept4() returned a relatively high number of errors. As we saw in the above example, accept4() will routinely receive the EAGAIN error and go back into the select() call to wait for another connection.

The resulting function list is also a nice to-do list you could use to brush up on your C system calls. Go through and read about one per day until you understand each of them. If you do this, the next time you’re tracing a Unicorn process under duress in production you’ll be much more fluent in its language of system calls.

Finding out what’s slow

We’ll wrap up by looking at how strace can help us determine the cause of performance problems in our systems. We recently used this at work to uncover one of those really gremlin-like problems where random, seemingly unrelated components of our distributed system degraded in performance and we had no idea why. One of our databases slowed down. Some of our web requests slowed down. And finally, using sudo got slow.

That was what finally gave us a clue. We used strace to trace the execution of sudo and to time each system call sudo made. We finally discovered that the slowness was in a log statement! We had apparently misconfigured a syslog server without realizing it and all of the processes which were talking to that server had mysteriously slowed down.

Let’s take a look at a simple example of using strace to track down performance problems in processes. Imagine we had a program with the following C source and we wanted to figure out why it takes over 2 seconds to execute:

Glancing through the code, there’s nothing obvious that sticks out (heh), so we’ll turn to strace for the answer. We use strace’s “-T” flag to ask it to time the execution of each system call during tracing. Here’s the output:

As you can see, strace has included a final column with the execution time (in seconds) for each traced call. Following the trace line by line, we see that almost every call was incredibly fast until we finally reach a call that took more than 2 seconds! Mystery solved. Apparently something in our program is calling nanosleep() with an argument of 2 seconds.

In a real-world trace with a lot more output we could save the data to a file, sort through it on the last column, and track down the offending function calls.

There’s more!

strace is an amazingly rich tool. I’ve only covered a few of the options that I use most frequently, but it can do a lot more. Check out the strace man page to browse some of the other ways you can manipulate its behavior. Especially be sure to look at the uses of “-e”, which accepts expressions allowing you to filter traces or change how traces are done.

If you read through this far and you didn’t know what all these function calls meant, don’t be alarmed. Neither did I at some point. I learned most of this stuff by tracing broken programs and reading man pages. If you program in a UNIX environment, I encourage you to pick a program you care about and strace its execution. Learn what’s happening at the system level. Read the man pages. Explore. It’s fun and you’ll understand your own work even better than before.

* Actually I’m simplifying a bit. We were on Solaris, and the equivalent tool there was truss. It’s basically the same. You get the point.

Candy Crush gets trolled by developers making games with the word "candy" in them.

Comments:"Candy Crush gets trolled by developers making games with the word "candy" in them."

URL:http://www.slate.com/blogs/future_tense/2014/01/30/candy_crush_gets_trolled_by_developers_making_games_with_the_word_candy.html


When someone trademarks something dumb there's only one thing to do ... (Photo from Candy Jam.)

What if a game developer was so concerned about protecting its viral iOS hit that it trademarked the word candy in Europe, and then succeeded in getting Apple to enforce the trademark? And what if other game developers were so annoyed by the situation that they launched a troll campaign to design and submit as many games as they could to Apple that included the word candy? Well, if all of that actually happened it would be hilarious and amazing. And it is happening right now, so be excited.

Candy Crush developer King succeeded in trademarking candy in Europe and is only a few weeks away from getting the same rights in the United States. But a group of plucky (and snarky) game developers has decided to fight the man and protest by submitting as many "candy" games as possible to Apple. The Candy Jam website explains that the game-making blitz is in protest of both the trademarking system in general and the allegedly reckless money-making on the part of King specifically. The submission period ends Feb. 3.

Developers have submitted more than 100 games already with names like Shew It Down: Candy Crap Saga, CanDieCanDieCanDie, ThisGameIsNotAboutCandy, CANDY on the EDGE, Candy Escape Goat Saga, Candy Crush SEGA, and Don't Let the Candies Crush You, according to Pocket-lint. (There's a bonus for incorporating the words scroll, memory, saga, and apple.)

The Candy Jam website says that the initiative is happening "because trademarking common names is ridiculous and because it gives us an occasion to make another game jam." And Laurent Raymond, a co-founder of Candy Jam, told the Escapist, "While there seems to be some kind of consensus about the need for a company to protect blatant ripoffs of game names and/or game aspects, I think that the fact that it was done with so little finesse is what gathered people in [opposition]."

Check out the #candyjam hashtag for more trolly fun and general gallivanting about.

Arduino Blog » Blog Archive » An award-winning LCD with an Arduino built-in (using only 2 pins!)

Comments:"Arduino Blog » Blog Archive » An award-winning LCD with an Arduino built-in (using only 2 pins!)"

URL:http://blog.arduino.cc/2014/01/30/an-award-winning-lcd/


An award-winning LCD with an Arduino built-in (using only 2 pins!)

Zoe Romano, January 30th, 2014

 

We are happy to announce a new entry in the Arduino At Heart program. It’s called arLCD by EarthMake, a US-based company with the mission to introduce ezLCD technology produced by its sister company EarthLCD to the Maker Market through special products.

The new Arduino at Heart is not just an LCD, and you should not confuse it with a snail's-pace 2.8-inch LCD shield that uses almost all your I/O pins!

arLCD is a full smart ezLCD GPU with the Arduino Uno R3 on the same PCB in a thin, easy to integrate package with a panel mount bezel available in the near future.

The 3.5-inch display has 64% more display area than a 2.8-inch LCD. The arLCD combines the Arduino and the award-winning ezLCD into a single product, ready to accept all Arduino-compatible shields. The arLCD can be used in many applications such as thermostat control, lighting controls, home security, audio control, water level gauge, robotics, operational control, and button switches.

In the video demo you can see how it works. Jazz shows us how to switch screens and display different programs. The first sketch is a design tool for choosing the colors for the screen layout; the second explores a thermostat prototyped on a breadboard, using a thermistor to read the current temperature and turn an LED on and off.

Want to learn more? Take a look at the documentation, download the library and then check the introductory video below:

 

 

If you’re interested, add your email on this page to be notified when it is available in the Arduino Store.

This entry was posted by Zoe Romano on Thursday, January 30th, 2014 and is filed under arduino, ArduinoAtHeart.

Moz Dumps Amazon Web Services, Citing Expense and 'Lacking' Service | Xconomy

Comments:" Moz Dumps Amazon Web Services, Citing Expense and 'Lacking' Service | Xconomy "

URL:http://www.xconomy.com/seattle/2014/01/30/moz-dumps-amazon-web-services-citing-expense-and-lacking-service/


Moz Dumps Amazon Web Services, Citing Expense and ‘Lacking’ Service

Seattle marketing technology company Moz had a worse-than-expected 2013 in terms of profitability and products. But what really jumped out at me in the privately held company’s startlingly frank review of the year was new CEO Sarah Bird’s blunt criticism of Amazon Web Services (AWS), which she says the company is leaving for reasons of cost, product stability, and service.

“We create a lot of our own data at Moz, and it takes a lot of computing power,” Bird says in her 2013 Year in Review post. “Over the years, we’ve spent many small fortunes at Amazon Web Services. It was killing our margins and adding to product instability. Adding insult to injury, we’ve found the service… lacking.”

An AWS representative declined to comment.

Moz had a tumultuous year. CEO Rand Fishkin gave up the top executive job. The company changed its name in May from SEOmoz, reflecting an Internet marketing landscape in which the practice of search engine optimization is merging with the broader category of inbound marketing. Revenues fell short of expectations.

The company’s inbound marketing product, Moz Analytics, “launched” over a rocky nine-month period that’s just wrapping up. The company generated gross revenue of $29.3 million last year, up 33 percent, which Bird described as “off-plan performance.” Sales had doubled in each of the prior two years.

She attributes the slower growth to delays, bugs, and a lack of certain features as Moz Analytics was rolled out. The new offering was designed to upgrade a previous product with a loyal following. Bottom line, Moz had a $5.7 million loss in 2013 before interest, taxes, depreciation, and amortization. It didn’t provide a similar figure last year, but had been profitable.

“We knew we were going to burn in 2013,” Bird writes. “That’s why we took the $18 [million] Series B” funding round in 2012, led by Brad Feld’s Foundry Group with participation from earlier investor Ignition Partners. “This is a bigger loss than we planned on, though, and we’re disappointed we missed our goals.”

Bird says Moz has adequate cash to continue growing and expects to return to profitability by the third quarter.

You should read Bird’s entire post if you’re following the company or the inbound marketing business closely. (Bird, by the way, was the company’s chief operating officer until earlier this year when she formally took over as CEO from co-founder Rand Fishkin. Fishkin announced the transition in December and said 2013 had been harder on him personally than almost any year in the company’s history, mainly “due to the challenges of scale.”)

Bird says part of Moz’ plan to return to profitability includes a transition away from AWS to its own private cloud. That strategy goes against the narrative we’ve been hearing for the last few years about startups, who have built businesses on commodity computing power rented on a pay-as-you-go basis from big cloud vendors like AWS. The idea is that the public cloud model is cheaper, more efficient, and less risky than buying and maintaining private computing infrastructure. Lately, even big enterprises have gotten into the act, running more of their operations on public clouds from Amazon, Microsoft, and other providers. Moz is just a single example, but it’s one backed up with the kind of financial details we don’t often see from privately held startups.

Moz spent almost $2.4 million on cloud services—the “vast majority” on AWS—in 2011, or 21 percent of that year’s revenue; $6.5 million in 2012, 30 percent of revenue; and $7.2 million last year, 25 percent of revenue.

That got the company’s attention, so it embarked on a plan in 2012 to build its own cloud in datacenters in Virginia, Washington, and Texas.

“This was a big bet with over $4 million in capital lease obligations on the line, and the good news is that it’s starting to pay off,” Bird says. “On a cash basis, we spent $6.2 million at Amazon Web Services, and a mere $2.8 million on our own data centers. The business impact is profound. We’re spending less and have improved reliability and efficiency.

“Our gross profit margin had eroded to ~64%, and as of December, it’s approaching 74%. We’re shooting for 80+%, and I know we’ll get there in 2014.”

Time will tell if Moz is the exception that proves the public cloud rule, or an early indicator that public clouds aren’t the best fit for everyone. Given Moz’ penchant for transparency, it’s reasonable to hope we’ll hear more from them about this transition.

Benjamin Romano is a senior editor for Xconomy in Seattle. Email him at bromano [at] xconomy.com. Follow @bromano


Equidate Launches A Secondary Market For Early Startup Employees To Sell Shares | TechCrunch

Comments:"Equidate Launches A Secondary Market For Early Startup Employees To Sell Shares | TechCrunch"

URL:http://techcrunch.com/2014/01/30/equidate/


It was once a rare practice, but employees are now finding more ways to unload vested shares in their startups along the way.

While employers have typically tried to control these sales, a new marketplace called Equidate is opening up that will let employees sell equity with or without the startup’s consent (although Equidate would prefer to collaborate with employers).

Over the past decade, many companies like Facebook have elected to wait longer before going public. That meant that longtime employees wound up with their wealth mostly tied up in the stock of their companies with few options to diversify their holdings. At the same time, certain investors wanted access to a growing pool of pre-IPO tech companies.

So companies like New York-based SecondMarket cropped up. They have helped facilitate employee share sales for privately-held companies like SurveyMonkey, which raised about $800 million in January of last year.

Equidate’s critique of SecondMarket’s model is that an employee who wants to sell shares has to do it through their company.

Two of Equidate’s co-founders, Sohail Prasad and Samvit Ramadurgam, saw that their friends in high-growth tech companies felt stuck financially. “It’s difficult if you want to sell shares as an individual,” said Prasad, who was previously a product manager at Zynga and an early employee at Chartboost. (But these restrictions also exist because as secondary sales have become more popular, companies have also wanted control. They want to manage the flow of private information of their financial performance and they want to know who their shareholders are.)

So Equidate has created contracts tied to the value of an employee’s shares, which have to be vested and owned by that employee. (Employees can’t participate if they just have options or if they have restricted stock units.)

“It’s similar to a collateralized loan. No shares are trading hands,” Prasad said. Prasad said that an Equidate contract allows an investor to buy rights to the economic upside of a share, while avoiding the legal hoops a company has to go through when it’s adding extra shareholders to its cap table.

Gil Silberman, Equidate’s other co-founder, created the contracts after working as a lawyer with companies like LinkedIn, Craigslist and OpenTable. A fourth team member, Andrea Lamari, works alongside Gil Silberman in a business development role.

They’re launching with four companies on the market: Dropbox, BitTorrent, Chartboost and Buzzfeed. They would like to bring more companies at around the Series B stage onto the platform, which means they’d sit between early-stage solutions like Funders Club and big late-stage rounds.

For now, Equidate will only allow accredited investors, who either have a net worth of more than $1 million or make at least $200,000 a year, to participate.

The four-person company hasn’t shared any details on how much it has raised to date or who its investors are.

Zynga Lays Off 314 Employees, Or 15% Of Its Workforce | TechCrunch

Comments:"Zynga Lays Off 314 Employees, Or 15% Of Its Workforce | TechCrunch"

URL:http://techcrunch.com/2014/01/30/zynga-layoffs-2/


Paired with the news of a big half-billion-dollar acquisition, Zynga is also laying off about 15 percent of its workforce, or about 314 employees.

This is part of a cost-reduction plan that is supposed to generate $33 million to $35 million in savings this year, excluding a $15 million to $17 million restructuring charge.

In an interview today, CEO Don Mattrick said these jobs would mostly come out of “infrastructure” areas and wouldn’t involve shutting down any individual studios.

Zynga has roughly 2,000 employees at a time when better-performing competitors lack anywhere near the same kind of headcount. Supercell, which sold half of itself for $1.53 billion last fall to Japanese carrier Softbank, currently has about 130 employees and was producing just shy of $200 million a quarter in revenue in the beginning of last year.

Since Mattrick took over the company from founding CEO Mark Pincus, Zynga has engaged in a series of layoffs, cut out middle layers of management and shut down poorly performing games. Last summer, the company let go of about 520 people, or 18 percent of its workforce.

Microsoft Said to Be Preparing to Make Satya Nadella CEO - Bloomberg

Comments:"Microsoft Said to Be Preparing to Make Satya Nadella CEO - Bloomberg "

URL:http://www.bloomberg.com/news/2014-01-30/microsoft-said-to-be-preparing-to-make-satya-nadella-ceo.html


Microsoft Corp. (MSFT)’s board is preparing to make Satya Nadella, the company’s enterprise and cloud chief, chief executive officer and is discussing replacing Bill Gates as chairman, according to people with knowledge of the process.

One person the board is considering to take the place of co-founder Gates as chairman is Microsoft lead independent director John Thompson, said the people, who asked not to be identified because the process is private. Even if Gates steps down as chairman, he may be more involved in the company, said two people familiar with the matter, particularly in areas like product development.

Nadella, 46, emerged as one of the stronger candidates to replace departing CEO Steve Ballmer, 57, weeks ago, people familiar with the search have said. The plans aren’t final, said the people.

The leadership changes would take place at a crucial point as Microsoft moves away from its roots as a software maker to focus on hardware and Internet-based services. Rivals such as Apple Inc. have shifted the technology landscape away from Microsoft’s mainstay of personal computers to mobile devices. Ballmer, who said he would retire by August, last year revamped Microsoft’s organizational structure and agreed to buy Nokia Oyj’s handset business for $7.2 billion.

Two Decades

Over the past decade, Microsoft’s stock has gained 74 percent including dividends, compared with a 93 percent rise in the Standard & Poor’s 500 Index. The shares rose 1.8 percent to $37.51 at 9:35 a.m. in New York.

Frank Shaw, a spokesman for Microsoft, declined to comment.

Microsoft has only had two CEOs -- Gates and Ballmer -- in its history. In turning to Nadella, the company would get an enterprise-technology veteran who joined Microsoft in 1992 and has had leadership roles in cloud services, server software, Internet search and business applications.

As president of Microsoft’s server business, Nadella boosted revenue to $20.3 billion in the fiscal year through June, up from $16.6 billion when he took over in 2011. That unit became cloud and enterprise when Ballmer overhauled Microsoft’s structure in July to focus the company on devices and services.

Engineering Corps

Michael Cusumano, a professor at MIT’s Sloan School of Management, said Nadella is a good choice given his close ties and strong reputation within Microsoft’s huge engineering corps.

“Microsoft is a contentious enough place that you wouldn’t want to bring in someone who lacked credibility with the engineers,” Cusumano said.

A bigger shift would be if Gates vacates the chairman role that he has held for years, said Daniel Ives, an analyst at FBR Capital Markets. Gates has been chairman since 1981. Going with an insider for CEO would necessitate turning to an outsider for chairman, Ives said.

“If they are going for the CEO who is right down the hall from Steve Ballmer, you’ve got to give investors a bone,” said Ives, who has the equivalent of a hold rating on the stock. “It would obviously be a big change and a historic change but it’s obvious Microsoft needs change in terms of strategy in terms of the next leg of growth.”

Earlier this month, Gates, 58, who has devoted much of his time to the Bill & Melinda Gates Foundation and charitable efforts, told Bloomberg Television that he will work on philanthropy full time for the rest of his life and contribute part time as a board member of the software maker.

Philanthropic Endeavors

“The board is doing important work right now,” said Gates, who created the foundation with his wife in 2000. “The foundation is the biggest part of my time. I put in part-time work to help as a board member. My full-time work will be the foundation for the rest of my life. I will not change that.”

If Thompson, 64, takes over as chairman, Microsoft would have a more than 40-year technology executive in the role. Thompson was a longtime International Business Machines Corp. executive before joining technology-security company Symantec Corp. (SYMC) in 1999. He took the company from $600 million to $6 billion in sales over a decade-long tenure, before stepping down in 2009. Thompson currently runs Virtual Instruments Inc., a San Jose, California-based maker of software that tracks application and hardware performance.

Thompson joined Microsoft’s board in 2012 and has asked tough questions of top executives including Ballmer, people familiar with the situation have said. That helped to create an environment that sped Ballmer’s decision to retire by this August, the people have said.

Five-Month Search

The leadership changes would cap a five-month search that has been dogged with difficulties. Thompson, who is heading the CEO search, wrote in a blog post last month that the board planned to complete a search for CEO in the “early part of 2014.” He said the board started with more than 100 candidates.

The company considered internal candidates including Executive Vice President Tony Bates and Chief Operating Officer Kevin Turner. External candidates have included former Nokia CEO Stephen Elop, Ford Motor Co. CEO Alan Mulally, Qualcomm Inc. CEO-elect Steve Mollenkopf and Ericsson AB CEO Hans Vestberg, people familiar with the search have said.

Some CEO candidates have declined to be considered or dropped out of the running. Vestberg has said he plans to stay at Ericsson. Mulally took himself out of the running earlier this month.

To contact the reporters on this story: Dina Bass in Seattle at dbass2@bloomberg.net; Peter Burrows in San Francisco at pburrows@bloomberg.net; Jonathan Erlichman in New York at jerlichman1@bloomberg.net

To contact the editor responsible for this story: Pui-Wing Tam at ptam13@bloomberg.net

In Cryptography, Advances in Program Obfuscation

Let’s Bolt! | Parse Blog


Comments:"Let’s Bolt! | Parse Blog"

URL:http://blog.parse.com/2014/01/29/lets-bolt/


When Parse joined Facebook, we immediately started looking for ways to improve our SDKs by comparing code and learning from each others’ successes. We found that there were a lot of small, low-level utility classes in iOS and Android that we had both implemented. Rather than continue to have two versions of these components, we decided to collaborate on one common library between our SDKs. Today, we are open-sourcing that library to make it available to others.

Bolts is a collection of low-level libraries designed to make developing mobile apps easier. Using Bolts does not require using any Parse or Facebook services, nor does it require having a Parse or Facebook developer account. Simply download the jar or framework file and drop it into your project. Or you can download the source directly from GitHub. Documentation for all of the components in Bolts is available on GitHub as well.

The first component in Bolts is “tasks”, which make organization of complex asynchronous code more manageable. A task is kind of like a JavaScript Promise, but available for iOS and Android. For example, if you have an asynchronous method for saving an object in iOS, you can have it return a BFTask*, and handle the result in a continuation block.

[[object saveAsync:obj] continueWithBlock:^id(BFTask *task) {
    if (task.isCancelled) {
        // the save was cancelled.
    } else if (task.error) {
        // the save failed.
    } else {
        // the object was saved successfully.
        SaveResult *saveResult = task.result;
    }
    return nil;
}];

The equivalent code in Android would be:

object.saveAsync().continueWith(new Continuation<ParseObject, Void>() {
    public Void then(Task<ParseObject> task) throws Exception {
        if (task.isCancelled()) {
            // the save was cancelled.
        } else if (task.isFaulted()) {
            // the save failed.
            Exception error = task.getError();
        } else {
            // the object was saved successfully; the result type matches
            // the first type parameter of the Continuation.
            ParseObject saved = task.getResult();
        }
        return null;
    }
});

Tasks have many advantages over the other models for asynchronous development on these platforms, such as AsyncTask and NSOperation. For more information and example code, please see the platform-specific README files on GitHub.

More Bolts will be coming soon!

Zynga Buys NaturalMotion For $527M, Signaling A New Tack For The Gaming Giant | TechCrunch


Comments:"Zynga Buys NaturalMotion For $527M, Signaling A New Tack For The Gaming Giant | TechCrunch"

URL:http://techcrunch.com/2014/01/30/zynga-naturalmotion/


Zynga has long been famous (or infamous?) for its data-driven approach to game design. The company never focused on building strong character IP, or intellectual property, in favor of releasing games that had been thoroughly funnel-tested.

But now that founding CEO Mark Pincus has stepped aside and let Xbox executive Don Mattrick take the reins, perhaps the company is going in a totally new direction.

Mattrick is sending a big signal on that front today with a $527 million deal to acquire NaturalMotion, the Oxford, U.K.-based gaming company behind franchises like CSR Racing and Clumsy Ninja. The deal involves $391 million in cash and about 39.8 million shares of Zynga stock at yesterday’s price, leaving Zynga with about $1.2 billion in cash and marketable securities on hand. (There is also sad news today, with layoffs for 15 percent of the company’s workforce.)

Torsten Reil, who runs NaturalMotion, doesn’t come from a cookie cutter gaming background.

He had actually been working on his Ph.D. in Complex Systems in Oxford’s zoology department when he decided to go in a totally different direction. He used his biology background to design software that could realistically animate 3D movement. Those products went on to become a middleware business that helped animate games in the Grand Theft Auto franchise and films in the Lord of the Rings trilogy.

Then a few years ago, Reil and NaturalMotion pivoted to building their own games, using their proprietary animation software to make freemium titles like My Horse and most recently, Clumsy Ninja.

Reil is a perfectionist, and he’ll delay games for months until the details are just right. That attention and care has attracted support from key partners like Apple, which let the company demo Clumsy Ninja and CSR Racing on-stage at the WWDC and iPhone 5 keynotes. (Let me just stress that being invited by Apple to go on-stage at their marquee events is like winning the “Best Picture” Oscar for an app developer.)

When he launched Clumsy Ninja last fall (a whole year after the company teased it on-stage at the iPhone 5 launch), Reil told me,

“We want to get the game right. We want to make people laugh and smile. We don’t want to design it to be a hard-core monetizing game. It has to be a delightful, wholesome experience.”

It doesn’t sound like stereotypical Zynga, does it?

Well, times have changed on the iOS platform. It used to be comparatively cheap to launch lots of casual, social games on the platform. But if you look at the top-grossing charts today, they’re almost the same as a year ago with companies like King, Supercell and MachineZone at the top.

That’s because it’s so expensive now to market and acquire users on mobile platforms. So if you’re going to launch a game, it takes much more time and investment than it did a year ago. So that’s why a new and even more deliberate approach is necessary. It’s not enough to fast-follow on proven gaming categories, which was Zynga’s strategy on the Facebook platform.

With this deal, Zynga gets a good portfolio of current and upcoming games, a character with real franchise potential in Clumsy Ninja, a middleware business and a 260-person gaming company that is culturally focused on quality.

As for NaturalMotion and its investors, the company was backed with $11 million from Benchmark Capital and had former EA executive Mitch Lasky on the board. Balderton Capital, which used to be Benchmark Europe, was NaturalMotion’s first venture investor.

Lasky, who overlapped with Mattrick while both were at EA, shared some of his thoughts here.

“NaturalMotion will provide Don with a fantastic slate of mobile products (both new, innovative ones, as well as sequels of their current hits),” he wrote in a blog post. “Combined with Zynga’s reach, social networking expertise, and advanced audience measurement tools, NaturalMotion and Zynga should be a very potent combination.”

GitHub Security Bug Bounty · GitHub


Comments:"GitHub Security Bug Bounty · GitHub"

URL:https://github.com/blog/1770-github-security-bug-bounty


Our users' trust is something we never take for granted here at GitHub. In order to earn and keep that trust we are always working to improve the security of our services. Some vulnerabilities, however, can be very hard to track down and it never hurts to have more eyes.

We are excited to launch the GitHub Bug Bounty to better engage with security researchers. The idea is simple: hackers and security researchers (like you) find and report vulnerabilities through our responsible disclosure process. Then, to recognize the significant effort that these researchers often put forth when hunting down bugs, we reward them with some cold hard cash.

Bounties typically range from $100 up to $5000 and are determined at our discretion based on actual risk and potential impact to our users. For example, if you find a reflected XSS that is only possible in Opera, which is < 2% of our traffic, then the severity and reward will be lower. But a persistent XSS that works in Chrome, which accounts for > 60% of our traffic, will earn a much larger reward.

Right now our bug bounty program is open for a subset of our products and services (the full list is on the site), but we are already planning on expanding the scope as things warm up.

Check out the GitHub Bug Bounty site for full details, and happy hunting!



7 Minute Workout by Quick Fit

We Are Giving Ourselves Cancer

Why I'll be Pirating Adobe's Products From Now On | PARKER


Comments:"Why I'll be Pirating Adobe's Products From Now On | PARKER"

URL:http://parkerrr.com/ill-pirating-adobes-products-now/


Edit: I’ve had some questions on software that I’ve been using in the interim. For Vector design, Sketch is a near perfect replacement. As for Photoshop, honestly, finding a perfect replacement has been hard, but one of the best applications I’ve been looking at is Pixelmator. If you have any suggestions for replacement software, with great support, leave a comment!

Adobe and I had a great relationship. I paid them money, they provided me with a fantastic product.
While I was in college, my biggest investment was the Adobe Master Suite CS5. It was everything I’d ever need. And no, it wasn’t part of my tuition. I worked long hours to cover that bill, but I knew it was worth every penny. Photoshop and Illustrator were my preferred applications, a systematic ménage à trois that produced some of my greatest work to date. But oh, the heartbreak, when I found out my product (relationship) was no longer supported.

It all started with my new MacBook Retina. A gleaming beacon of self-worth, and productivity (Yes, Mac IS more productive). I had to reinstall all of my Adobe Software Master Suite onto it, sans a CD-ROM… So I did what many people do. I looked for a download of the software, but to no avail: there were none for the “Student/Teacher License”. I understand there is a difference in licensing, but all I wanted was the same product I had spent all my hard-earned cash on, not an upgrade, no special treatment.

I called Adobe support, and after waiting on the line for a good 35 minutes, I was greeted by Adobe’s “International” support team, based out of India where they can pay pennies on the dollar for you to get support in broken English. It’s almost as good as real English… almost. After I explained my ordeal to the support tech, he guided me to Adobe’s website, asked me to sign into my account and hurriedly told me the download link would be under the “My Products” section. I kept him on the line while I checked, but couldn’t find the link. I asked him to walk me through it, and he proceeded to put me on hold for 2 minutes. When he came back on, he said that I should instead go to the Adobe forums, and the download link would be in there. I followed suit and tried searching for Master Collection, CS5, Student, Teacher, and more keywords, but did not find anything. When I asked him what I should look up in their search, he put me on hold again. Finally he came back to tell me that my product was no longer supported, and my options were to either buy an upgrade or download it from “Somewhere else”. The conversation went like this:
Support: You can download it on the web somewhere.
Me: Is there a specific site you would recommend?
Support: No Sir, you will just have to search it.
Me: All I’ve found is torrents for cracked software.
Support: Yes Sir, that will work.
Me: So you’re telling me I should download an illegal copy of your software, and use my legal serial with it?
Support: Yes Sir, that will work.


I was floored. My bright and shining star for everything design, has instructed me to pirate their software. I couldn’t believe it. Why would a company as brilliant, professional, and steadfast as Adobe tell me to pirate their software? I guess you could argue that Adobe themselves weren’t “telling me” to pirate it, but they were giving me no options, and poor support. My next stop, calling Adobe sales department.

After a much shorter hold period (They wouldn’t want to make it difficult for you to pay them now, would they?) I was connected with a sales rep. His manner was very professional, and I could tell he was smart. “Thank god I got a smart sales guy. He’ll understand what to do.”, I said to myself. I explained my predicament and told him what I had talked with Adobe’s support team about. I even told him that the support representative told me that I could pirate their software and it would work with my serial. “Well, that’s not illegal as long as you have your serial applied to the product.”, the sales rep said. So I thought for a moment. Was pirating Adobe’s software truly illegal? I mean, the real purchase is the serial number. You can get the trial software on their site for free, but once you pay up, you receive a license. But that did not settle my intrigue. I pushed the sales rep harder, but every response was another opportunity for him to get me to upgrade to CS6. I put it to the salesman like this:

Let’s say you purchase a car. Under warranty, something goes wrong with your car which is covered. You bring that car back to the place of purchase and you ask for them to fix your car. They tell you that they cannot fix your car, but they can sell you a new one.

Well, I lost the battle, but I have not lost the war. Situations like these prompt me to ponder, “What is GREAT support?” It’s a complicated struggle for Adobe, maintaining their software standards, fending off hackers and pirates, and providing genuine, gracious support. I can’t blame the support rep from India for suggesting I pirate their software. I cannot blame the sales rep for pushing me to upgrade, and ignoring the root of my problem. They’re all pieces of the puzzle, and are all informed of only what they need to know. But to truly make an impact, you have to bring a human element back into the game. Most people genuinely want to help. When their hands are tied, they aren’t acting human any more. You have to allow your employees some sort of freedom to help customers in need, Adobe. You’re making it difficult for your employees to succeed, and for your customers to be humbled by your support. Take a walk on the wild side, Adobe, and imagine your bank did this to you.
“Your account is no longer supported, you’ll have to purchase our new package in order to access your funds.”

Edit 2: For any of you that read this and think I’m an idiot: don’t mistake my anger for the true issue. Adobe representatives have no other resolution for my problem, so they say “it’s not supported”. Paying as much as I did for an Adobe license, I would expect better support.


Ken Shirriff's blog: Bitcoins the hard way: Using the raw Bitcoin protocol


Comments:"Ken Shirriff's blog: Bitcoins the hard way: Using the raw Bitcoin protocol"

URL:http://www.righto.com/2014/02/bitcoins-hard-way-using-raw-bitcoin.html


All the recent media attention on Bitcoin inspired me to learn how Bitcoin really works, right down to the bytes flowing through the network. Normal people use software[1] that hides what is really going on, but I wanted to get a hands-on understanding of the Bitcoin protocol. My goal was to use the Bitcoin system directly: create a Bitcoin transaction manually, feed it into the system as hex data, and see how it gets processed. This turned out to be considerably harder than I expected, but I learned a lot in the process and hopefully you will find it interesting.

This blog post starts with a quick overview of Bitcoin and then jumps into the low-level details: creating a Bitcoin address, making a transaction, signing the transaction, feeding the transaction into the peer-to-peer network, and observing the results.

A quick overview of Bitcoin

I'll start with a quick overview of how Bitcoin works[2], before diving into the details. Bitcoin is a relatively new digital currency[3] that can be transmitted across the Internet. You can buy bitcoins[4] with dollars or other traditional money from sites such as Coinbase or MtGox[5], send bitcoins to other people, buy things with them at some places, and exchange bitcoins back into dollars.

To simplify slightly, bitcoins consist of entries in a distributed database that keeps track of the ownership of bitcoins. Unlike money in a bank, bitcoins are not tied to users or accounts. Instead bitcoins are owned by a Bitcoin address, for example 1KKKK6N21XKo48zWKuQKXdvSsCf95ibHFa.

Bitcoin transactions

A transaction is the mechanism for spending bitcoins. In a transaction, the owner of some bitcoins transfers ownership to a new address.

A key innovation of Bitcoin is how transactions are recorded in the distributed database through mining. Transactions are grouped into blocks, and about every 10 minutes a new block of transactions is sent out, becoming part of the transaction log known as the blockchain; at that point the transaction has been made (more-or-less) official.[6] Bitcoin mining is the process that puts transactions into a block, to make sure everyone has a consistent view of the transaction log. To mine a block, miners must find an extremely rare solution to an (otherwise-pointless) cryptographic problem. Finding this solution generates a mined block, which becomes part of the official blockchain.

Mining is also the mechanism for new bitcoins to enter the system. When a block is successfully mined, new bitcoins are generated in the block and paid to the miner. This mining bounty is large - currently 25 bitcoins per block (about $19,000). In addition, the miner gets any fees associated with the transactions in the block. Because of this, mining is very competitive with many people attempting to mine blocks. The difficulty and competitiveness of mining is a key part of Bitcoin security, since it ensures that nobody can flood the system with bad blocks.

The peer-to-peer network

There is no centralized Bitcoin server. Instead, Bitcoin runs on a peer-to-peer network. If you run a Bitcoin client, you become part of that network. The nodes on the network exchange transactions, blocks, and addresses of other peers with each other. When you first connect to the network, your client downloads the blockchain from some random node or nodes. In turn, your client may provide data to other nodes. When you create a Bitcoin transaction, you send it to some peer, who sends it to other peers, and so on, until it reaches the entire network. Miners pick up your transaction, generate a mined block containing your transaction, and send this mined block to peers. Eventually your client will receive the block and your client shows that the transaction was processed.

Cryptography

Bitcoin uses digital signatures to ensure that only the owner of bitcoins can spend them. The owner of a Bitcoin address has the private key associated with the address. To spend bitcoins, they sign the transaction with this private key, which proves they are the owner. (It's somewhat like signing a physical check to make it valid.) A public key is associated with each Bitcoin address, and anyone can use it to verify the digital signature.

Blocks and transactions are identified by a 256-bit cryptographic hash of their contents. This hash value is used in multiple places in the Bitcoin protocol. In addition, finding a special hash is the difficult task in mining a block.
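To make the idea of "finding a special hash" concrete, here is a toy sketch in Python. It is an assumption-laden simplification, not the real block header format: the function names, the `difficulty_bits` parameter, and the sample data are all hypothetical. It simply appends a 4-byte nonce to some bytes and searches for a double SHA-256 hash below a target.

```python
import hashlib
import struct


def double_sha256(data):
    """Apply SHA-256 twice, producing a 256-bit (32-byte) hash."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(block_data, difficulty_bits, max_nonce=2**32):
    """Search for a nonce whose hash, read as an integer, is below the target.

    Toy model only: a real block hashes an 80-byte header and encodes the
    target differently. Each extra difficulty bit doubles the expected work.
    """
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = double_sha256(block_data + struct.pack("<I", nonce))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no solution within the nonce range


# About 2**16 attempts are expected at 16 bits of difficulty.
nonce, digest_hex = mine(b"example block of transactions", difficulty_bits=16)
print(nonce, digest_hex)
```

Checking a claimed solution takes one hash, while finding it takes many attempts. That asymmetry is what makes mined blocks expensive to forge.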

Bitcoins do not really look like this. Photo credit: Antana, CC:by-sa

Diving into the raw Bitcoin protocol

The remainder of this article discusses, step by step, how I used the raw Bitcoin protocol. First I generated a Bitcoin address and keys. Next I made a transaction to move a small amount of bitcoins to this address. Signing this transaction took me a lot of time and effort. Finally, I fed this transaction into the Bitcoin peer-to-peer network and waited for it to get mined.

It turns out that actually using the Bitcoin protocol is harder than I expected. As you will see, the protocol is a bit of a jumble: it uses big-endian numbers, little-endian numbers, fixed-length numbers, variable-length numbers, custom encodings, DER encoding, and a variety of cryptographic algorithms, seemingly arbitrarily. As a result, there's a lot of annoying manipulation to get data into the right format.[7]

The second complication with using the protocol directly is that being cryptographic, it is very unforgiving. If you get one byte wrong, the transaction is rejected with no clue as to where the problem is.[8]

The final difficulty I encountered is that the process of signing a transaction is much more difficult than necessary, with a lot of details that need to be correct. In particular, the version of a transaction that gets signed is very different from the version that actually gets used.

Bitcoin addresses and keys

My first step was to create a Bitcoin address. Normally you use Bitcoin client software to create an address and the associated keys. However, I wrote some Python code to create the address, showing exactly what goes on behind the scenes.

Bitcoin uses a variety of keys and addresses, so the following diagram may help explain them. You start by creating a random 256-bit private key. The private key is needed to sign a transaction and thus transfer (spend) bitcoins. Thus, the private key must be kept secret or else your bitcoins can be stolen.

The Elliptic Curve DSA algorithm generates a 512-bit public key from the private key. (Elliptic curve cryptography will be discussed later.) This public key is used to verify the signature on a transaction. Inconveniently, the Bitcoin protocol adds a prefix of 04 to the public key. The public key is not revealed until a transaction is signed, unlike most systems where the public key is made public.

How bitcoin keys and addresses are related

The next step is to generate the Bitcoin address that is shared with others. Since the 512-bit public key is inconveniently large, it is hashed down to 160 bits using the SHA-256 and RIPEMD-160 hash algorithms.[9] The result is then encoded in ASCII using Bitcoin's custom Base58Check encoding.[10] The resulting address, such as 1KKKK6N21XKo48zWKuQKXdvSsCf95ibHFa, is the address people publish in order to receive bitcoins. Note that you cannot determine the public key or the private key from the address. If you lose your private key (for instance by throwing out your hard drive), your bitcoins are lost forever.

Finally, the Wallet Import Format key (WIF) is used to add a private key to your client wallet software. This is simply a Base58Check encoding of the private key into ASCII, which is easily reversed to obtain the 256-bit private key.

To summarize, there are three types of keys: the private key, the public key, and the hash of the public key, and they are represented externally in ASCII using Base58Check encoding. The private key is the important key, since it is required to access the bitcoins and the other keys can be generated from it. The public key hash is the Bitcoin address you see published.

I used the following code snippet[11] to generate a private key in WIF format and an address. The private key is simply a random 256-bit number. The ECDSA crypto library generates the public key from the private key.[12] The Bitcoin address is generated by SHA-256 hashing, RIPEMD-160 hashing, and then Base58 encoding with checksum. Finally, the private key is encoded in Base58Check to generate the WIF encoding used to enter a private key into Bitcoin client software.[1]
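As a rough illustration of the final encoding step only, here is a minimal Base58Check sketch in Python. It is not the author's snippet, which also derives the public key with an ECDSA library. The function names are hypothetical and the sample private key is an arbitrary placeholder. The sketch assumes the usual mainnet version bytes: 0x80 for a WIF private key and 0x00 for an address.

```python
import hashlib

# Base58 alphabet: no 0, O, I or l, to avoid look-alike characters.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def checksum(data):
    """First four bytes of the double SHA-256 hash of the data."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def base58check_encode(version, payload):
    """Prefix a version byte, append the checksum, and encode in Base58."""
    raw = bytes([version]) + payload
    raw += checksum(raw)
    n = int.from_bytes(raw, "big")
    text = ""
    while n > 0:
        n, rem = divmod(n, 58)
        text = ALPHABET[rem] + text
    # Each leading zero byte is written as a leading '1'.
    zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * zeros + text


def base58check_decode(text):
    """Reverse the encoding, returning (version, payload)."""
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    zeros = len(text) - len(text.lstrip("1"))
    raw = b"\x00" * zeros + n.to_bytes((n.bit_length() + 7) // 8, "big")
    data, check = raw[:-4], raw[-4:]
    if checksum(data) != check:
        raise ValueError("bad Base58Check checksum")
    return data[0], data[1:]


# Placeholder 256-bit private key (never use a predictable key for real funds).
private_key = bytes(range(1, 33))
wif = base58check_encode(0x80, private_key)
print(wif)
```

An address is produced the same way, with version byte 0x00 and the 160-bit public key hash as the payload, which is why addresses of this kind begin with "1".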

Inside a transaction

A transaction is the basic operation in the Bitcoin system. You might expect that a transaction simply moves some bitcoins from one address to another address, but it's more complicated than that. A Bitcoin transaction moves bitcoins between one or more inputs and outputs. Each input is a transaction and address supplying bitcoins. Each output is an address receiving bitcoin, along with the amount of bitcoins going to that address.

A sample Bitcoin transaction. Transaction C spends .008 bitcoins from Transactions A and B.

The diagram above shows a sample transaction "C". In this transaction, .005 BTC are taken from an address in Transaction A, and .003 BTC are taken from an address in Transaction B. (Note that arrows are references to the previous outputs, so are backwards to the flow of bitcoins.) For the outputs, .003 BTC are directed to the first address and .004 BTC are directed to the second address. The leftover .001 BTC goes to the miner of the block as a fee. Note that the .015 BTC in the other output of Transaction A is not spent in this transaction.

Each input used must be entirely spent in a transaction. If an address received 100 bitcoins in a transaction and you just want to spend 1 bitcoin, the transaction must spend all 100. The solution is to use a second output for change, which returns the 99 leftover bitcoins back to you.

Transactions can also include fees. If there are any bitcoins left over after adding up the inputs and subtracting the outputs, the remainder is a fee paid to the miner. The fee isn't strictly required, but transactions without a fee will be a low priority for miners and may not be processed for days or may be discarded entirely.[13] A typical fee for a transaction is 0.0002 bitcoins (about 20 cents), so fees are low but not trivial.
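The fee and change arithmetic can be sketched in a few lines of Python. The helper names are hypothetical. Amounts are kept as integer satoshis (1 bitcoin = 100,000,000 satoshis) to avoid floating-point rounding.

```python
SATOSHIS_PER_BTC = 100_000_000


def transaction_fee(input_values, output_values):
    """The fee is whatever the inputs supply beyond what the outputs claim."""
    fee = sum(input_values) - sum(output_values)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee


def change_amount(input_value, payment, fee=0):
    """An input must be spent whole, so the remainder goes to a change output."""
    change = input_value - payment - fee
    if change < 0:
        raise ValueError("input too small for payment plus fee")
    return change


# Sample transaction C: inputs of .005 and .003 BTC, outputs of .003 and .004 BTC.
fee = transaction_fee([500_000, 300_000], [300_000, 400_000])
print(fee / SATOSHIS_PER_BTC)  # the leftover .001 BTC goes to the miner

# Spending 1 bitcoin from a 100-bitcoin input returns 99 as change.
print(change_amount(100 * SATOSHIS_PER_BTC, 1 * SATOSHIS_PER_BTC) // SATOSHIS_PER_BTC)
```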

Manually creating a transaction

For my experiment I used a simple transaction with one input and one output, which is shown below. I started by buying bitcoins from Coinbase and putting 0.00101234 bitcoins into address 1MMMMSUb1piy2ufrSguNUdFmAcvqrQF8M5, which was transaction 81b4c832.... My goal was to create a transaction to transfer these bitcoins to the address I created above, 1KKKK6N21XKo48zWKuQKXdvSsCf95ibHFa, subtracting a fee of 0.0001 bitcoins. Thus, the destination address will receive 0.00091234 bitcoins.

Structure of the example Bitcoin transaction.

Following the specification, the unsigned transaction can be assembled fairly easily, as shown below. There is one input, which is using output 0 (the first output) from transaction 81b4c832.... Note that this transaction hash is inconveniently reversed in the transaction. The output amount is 0.00091234 bitcoins (91234 is 0x016462 in hex), which is stored in the value field in little-endian form. The cryptographic parts - scriptSig and scriptPubKey - are more complex and will be discussed later.

version: 01 00 00 00
input count: 01
input:
    previous output hash (reversed): 48 4d 40 d4 5b 9e a0 d6 52 fc a8 25 8a b7 ca a4 25 41 eb 52 97 58 57 f9 6f b5 0c d7 32 c8 b4 81
    previous output index: 00 00 00 00
    script length
    scriptSig: script containing signature
    sequence: ff ff ff ff
output count: 01
output:
    value: 62 64 01 00 00 00 00 00
    script length
    scriptPubKey: script containing destination address
block lock time: 00 00 00 00

Here's the code I used to generate this unsigned transaction. It's just a matter of packing the data into binary. Signing the transaction is the hard part, as you'll see next.
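A minimal sketch of that packing step, using Python's struct module, might look like the following. It is not the author's code. The function names are hypothetical, and the two scripts are left as empty placeholders because they are discussed later. The reversed hash bytes are copied from the field listing above.

```python
import struct


def varint(n):
    """Encode a count or length as a Bitcoin variable-length integer."""
    if n < 0xfd:
        return struct.pack("<B", n)
    if n <= 0xffff:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xffffffff:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def pack_transaction(prev_tx_hash_hex, prev_index, script_sig,
                     satoshis, script_pubkey):
    """Pack a one-input, one-output transaction into raw bytes."""
    return b"".join([
        struct.pack("<I", 1),                    # version
        varint(1),                               # input count
        bytes.fromhex(prev_tx_hash_hex)[::-1],   # hash is stored reversed
        struct.pack("<I", prev_index),           # previous output index
        varint(len(script_sig)), script_sig,     # script length + scriptSig
        b"\xff\xff\xff\xff",                     # sequence
        varint(1),                               # output count
        struct.pack("<Q", satoshis),             # value, 8 bytes little-endian
        varint(len(script_pubkey)), script_pubkey,
        struct.pack("<I", 0),                    # block lock time
    ])


# Reversed hash bytes as stored in the transaction (from the listing above).
PREV_HASH_REVERSED = bytes.fromhex(
    "484d40d45b9ea0d652fca8258ab7caa42541eb52975857f96fb50cd732c8b481")
PREV_TX_HASH = PREV_HASH_REVERSED[::-1].hex()    # display order: 81b4c832...

# 0.00091234 bitcoins is 91234 satoshis; scripts are empty placeholders here.
raw = pack_transaction(PREV_TX_HASH, 0, b"", 91234, b"")
print(raw.hex())
```

Taking the hash in display order and reversing it inside the function keeps the byte-order quirk in one place.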

How Bitcoin transactions are signed

The following diagram gives a simplified view of how transactions are signed and linked together.[14] Consider the middle transaction, transferring bitcoins from address B to address C. The contents of the transaction (including the hash of the previous transaction) are hashed and signed with B's private key. In addition, B's public key is included in the transaction.

By performing several steps, anyone can verify that the transaction is authorized by B. First, B's public key must correspond to B's address in the previous transaction, proving the public key is valid. (The address can easily be derived from the public key, as explained earlier.) Next, B's signature of the transaction can be verified using B's public key in the transaction. These steps ensure that the transaction is valid and authorized by B. One unexpected part of Bitcoin is that B's public key isn't made public until it is used in a transaction.

With this system, bitcoins are passed from address to address through a chain of transactions. Each step in the chain can be verified to ensure that bitcoins are being spent validly. Note that transactions can have multiple inputs and outputs in general, so the chain branches out into a tree.

How Bitcoin transactions are chained together.[14]

The Bitcoin scripting language

You might expect that a Bitcoin transaction is signed simply by including the signature in the transaction, but the process is much more complicated. In fact, there is a small program inside each transaction that gets executed to decide if a transaction is valid. This program is written in Script, the stack-based Bitcoin scripting language. Complex redemption conditions can be expressed in this language. For instance, an escrow system can require that two out of three specific users sign the transaction to spend it. Or various types of contracts can be set up.[15]

The Script language is surprisingly complex, with about 80 different opcodes. It includes arithmetic, bitwise operations, string operations, conditionals, and stack manipulation. The language also includes the necessary cryptographic operations (SHA-256, RIPEMD-160, etc.) as primitives. In order to ensure that scripts terminate, the language does not contain any looping operations. (As a consequence, it is not Turing-complete.) In practice, however, only a few types of transactions are supported.[16]

In order for a Bitcoin transaction to be valid, the two parts of the redemption script must run successfully. The script in the old transaction is called scriptPubKey and the script in the new transaction is called scriptSig. To verify a transaction, the scriptSig is executed, followed by the scriptPubKey. If the script completes successfully, the transaction is valid and the bitcoins can be spent. Otherwise, the transaction is invalid. The point of this is that the scriptPubKey in the old transaction defines the conditions for spending the bitcoins. The scriptSig in the new transaction must provide the data to satisfy the conditions.

In a standard transaction, the scriptSig pushes the signature (generated from the private key) to the stack, followed by the public key. Next, the scriptPubKey (from the source transaction) is executed to verify the public key and then verify the signature.

As expressed in Script, the scriptSig is:

PUSHDATA
signature data and SIGHASH_ALL
PUSHDATA
public key data

The scriptPubKey is:

OP_DUP
OP_HASH160
PUSHDATA
Bitcoin address (public key hash)
OP_EQUALVERIFY
OP_CHECKSIG

When this code executes, PUSHDATA first pushes the signature to the stack. The next PUSHDATA pushes the public key to the stack. Next, OP_DUP duplicates the public key on the stack. OP_HASH160 computes the 160-bit hash of the public key. PUSHDATA pushes the required Bitcoin address. Then OP_EQUALVERIFY verifies the top two stack values are equal - that the public key hash from the new transaction matches the address in the old transaction. This proves that the public key is valid. Next, OP_CHECKSIG checks that the signature of the transaction matches the public key and signature on the stack. This proves that the signature is valid.
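The stack walk-through above can be traced with a few lines of Python. This is only a sketch of the control flow, not a Script interpreter: `run_standard_script` is a name made up for this illustration, and the hash and signature checks are passed in as functions. The demo at the bottom uses toy stand-ins for both (a truncated SHA-256 and a string comparison), which are assumptions for the sake of a runnable example; the real OP_HASH160 is RIPEMD-160 applied to a SHA-256 hash, and the real OP_CHECKSIG performs ECDSA verification.

```python
import hashlib


def run_standard_script(signature, public_key, required_hash, hash160, check_sig):
    """Trace scriptSig followed by scriptPubKey on an explicit stack."""
    stack = []
    # scriptSig (from the new transaction)
    stack.append(signature)              # PUSHDATA signature
    stack.append(public_key)             # PUSHDATA public key
    # scriptPubKey (from the old transaction)
    stack.append(stack[-1])              # OP_DUP
    stack.append(hash160(stack.pop()))   # OP_HASH160
    stack.append(required_hash)          # PUSHDATA public key hash
    if stack.pop() != stack.pop():       # OP_EQUALVERIFY
        return False
    pub = stack.pop()                    # OP_CHECKSIG pops the key...
    sig = stack.pop()                    # ...and then the signature
    return bool(check_sig(sig, pub))


# Toy stand-ins, NOT the real Bitcoin primitives (assumptions for the demo).
def toy_hash160(data):
    return hashlib.sha256(data).digest()[:20]


def toy_check_sig(sig, pub):
    return sig == b"signed-by:" + pub


pub = b"example public key"
ok = run_standard_script(b"signed-by:" + pub, pub, toy_hash160(pub),
                         toy_hash160, toy_check_sig)
bad = run_standard_script(b"signed-by:" + pub, pub, b"\x00" * 20,
                          toy_hash160, toy_check_sig)
print(ok, bad)
```

The second call fails at OP_EQUALVERIFY because the public key does not hash to the required value, which is the "wrong address" case.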

Signing the transaction

I found signing the transaction to be the hardest part of using Bitcoin manually, with a process that is surprisingly difficult and error-prone. The basic idea is to use the ECDSA elliptic curve algorithm and the private key to generate a digital signature of the transaction, but the details are tricky. The signing process has been described through a 19-step process (more info). Click the thumbnail below for a detailed diagram of the process.

The biggest complication is that the signature appears in the middle of the transaction, which raises the question of how to sign the transaction before you have the signature. To avoid this problem, the scriptPubKey script is copied from the source transaction into the spending transaction (i.e. the transaction that is being signed) before computing the signature. Then the signature is turned into code in the Script language, creating the scriptSig script that is embedded in the transaction. It appears that using the previous transaction's scriptPubKey during signing is for historical reasons rather than any logical reason.[17] For transactions with multiple inputs, signing is even more complicated since each input requires a separate signature, but I won't go into the details.

One step that tripped me up is the hash type. Before signing, the transaction has a hash type constant temporarily appended. For a regular transaction, this is SIGHASH_ALL (0x00000001). After signing, this hash type is removed from the end of the transaction and appended to the scriptSig.
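As a rough sketch of the hash type step: the 4-byte little-endian constant is appended to the serialized transaction, and the result is hashed to give the value that gets signed. The function names here are made up for illustration, and the input is assumed to be the transaction with the scriptPubKey already copied into the input as described above. The double SHA-256 is the hash Bitcoin uses for this step; it is not spelled out in the paragraph above.

```python
import hashlib
import struct

SIGHASH_ALL = 1


def signing_digest(tx_with_script_pub_key):
    """Hash that is handed to ECDSA: double SHA-256 of transaction + hash type."""
    preimage = tx_with_script_pub_key + struct.pack("<I", SIGHASH_ALL)
    return hashlib.sha256(hashlib.sha256(preimage).digest()).digest()


def hash_type_byte():
    """Single byte appended after the DER signature inside the scriptSig."""
    return bytes([SIGHASH_ALL])


# Placeholder bytes standing in for a serialized transaction (assumption).
digest = signing_digest(b"\x01\x00\x00\x00")
print(digest.hex(), hash_type_byte().hex())
```

Note the asymmetry: four bytes (01 00 00 00) are appended before hashing, but only one byte (01) ends up in the scriptSig.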

Another annoying thing about the Bitcoin protocol is that the signature and public key are both 512-bit elliptic curve values, but they are represented in totally different ways: the signature is encoded with DER encoding but the public key is represented as plain bytes. In addition, both values have an extra byte, but positioned inconsistently: SIGHASH_ALL is put after the signature, and type 04 is put before the public key.

Debugging the signature was made more difficult because the ECDSA algorithm uses a random number.[18] Thus, the signature is different every time you compute it, so it can't be compared with a known-good signature.

With these complications it took me a long time to get the signature to work. Eventually, though, I got all the bugs out of my signing code and successfully signed a transaction. Here's the code snippet I used.

The final scriptSig contains the signature along with the public key for the source address (1MMMMSUb1piy2ufrSguNUdFmAcvqrQF8M5). This proves I am allowed to spend these bitcoins, making the transaction valid.

PUSHDATA 47: 47
signature (DER):
  sequence: 30
  length: 44
  integer: 02
  length: 20
  X: 2c b2 65 bf 10 70 7b f4 93 46 c3 51 5d d3 d1 6f c4 54 61 8c 58 ec 0a 0f f4 48 a6 76 c5 4f f7 13
  integer: 02
  length: 20
  Y: 6c 66 24 d7 62 a1 fc ef 46 18 28 4e ad 8f 08 67 8a c0 5b 13 c8 42 35 f1 65 4e 6a d1 68 23 3e 82
SIGHASH_ALL: 01
PUSHDATA 41: 41
public key:
  type: 04
  X: 14 e3 01 b2 32 8f 17 44 2c 0b 83 10 d7 87 bf 3d 8a 40 4c fb d0 70 4f 13 5b 6a d4 b2 d3 ee 75 13
  Y: 10 f9 81 92 6e 53 a6 e8 c3 9b d7 d3 fe fd 57 6c 54 3c ce 49 3c ba c0 63 88 f2 65 1d 1a ac bf cd
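To make the layout above concrete, here is a small sketch that takes the scriptSig bytes apart again. `parse_script_sig` is a hypothetical helper written for this illustration, not part of my signing code, and it assumes the simple case shown here: two short pushes and a DER signature with one-byte lengths. The two DER integers (labeled X and Y above) are usually called r and s in ECDSA descriptions.

```python
SCRIPT_SIG = bytes.fromhex(
    "47 30 44 02 20"
    " 2c b2 65 bf 10 70 7b f4 93 46 c3 51 5d d3 d1 6f"
    " c4 54 61 8c 58 ec 0a 0f f4 48 a6 76 c5 4f f7 13"
    " 02 20"
    " 6c 66 24 d7 62 a1 fc ef 46 18 28 4e ad 8f 08 67"
    " 8a c0 5b 13 c8 42 35 f1 65 4e 6a d1 68 23 3e 82"
    " 01 41 04"
    " 14 e3 01 b2 32 8f 17 44 2c 0b 83 10 d7 87 bf 3d"
    " 8a 40 4c fb d0 70 4f 13 5b 6a d4 b2 d3 ee 75 13"
    " 10 f9 81 92 6e 53 a6 e8 c3 9b d7 d3 fe fd 57 6c"
    " 54 3c ce 49 3c ba c0 63 88 f2 65 1d 1a ac bf cd"
)


def parse_script_sig(script):
    """Split a standard scriptSig into signature integers, hash type and key."""
    sig_len = script[0]                        # first PUSHDATA length
    sig_push = script[1:1 + sig_len]
    key_len = script[1 + sig_len]              # second PUSHDATA length
    public_key = script[2 + sig_len:2 + sig_len + key_len]

    der, hash_type = sig_push[:-1], sig_push[-1]
    if der[0] != 0x30 or der[1] != len(der) - 2:
        raise ValueError("not a DER sequence of the expected length")
    pos, integers = 2, []
    for _ in range(2):                         # two DER integers
        if der[pos] != 0x02:
            raise ValueError("expected a DER integer")
        length = der[pos + 1]
        integers.append(int.from_bytes(der[pos + 2:pos + 2 + length], "big"))
        pos += 2 + length
    return {"r": integers[0], "s": integers[1],
            "hash_type": hash_type, "public_key": public_key}


parsed = parse_script_sig(SCRIPT_SIG)
print("%064x" % parsed["r"])
print("%064x" % parsed["s"])
print(parsed["hash_type"], parsed["public_key"].hex())
```

This also shows the inconsistency mentioned earlier: the hash type byte trails the signature, while the type byte 04 leads the public key.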

The final scriptPubKey contains the script that must succeed to spend the bitcoins. Note that this script is executed at some arbitrary time in the future when the bitcoins are spent. It contains the destination address (1KKKK6N21XKo48zWKuQKXdvSsCf95ibHFa) expressed in hex, not Base58Check. The effect is that only the owner of the private key for this address can spend the bitcoins, so that address is in effect the owner.

OP_DUP: 76
OP_HASH160: a9
PUSHDATA 14: 14
public key hash: c8 e9 09 96 c7 c6 08 0e e0 62 84 60 0c 68 4e d9 04 d1 4c 5c
OP_EQUALVERIFY: 88
OP_CHECKSIG: ac
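The bytes above can be turned back into opcode names, and the embedded public key hash back into a Base58Check address, with a short sketch like the following. The helper names `disassemble` and `base58check` are made up for this example, only the four opcodes used here are in the table, and the version byte 0 is assumed for a main-network address.

```python
import hashlib

SCRIPT_PUB_KEY = bytes.fromhex(
    "76 a9 14 c8 e9 09 96 c7 c6 08 0e e0 62 84 60 0c 68 4e d9 04 d1 4c 5c 88 ac"
)
OPCODES = {0x76: "OP_DUP", 0xa9: "OP_HASH160",
           0x88: "OP_EQUALVERIFY", 0xac: "OP_CHECKSIG"}
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def disassemble(script):
    """Return (name, data) pairs; bytes 1..75 push that many bytes of data."""
    ops, pos = [], 0
    while pos < len(script):
        byte = script[pos]
        pos += 1
        if 1 <= byte <= 75:
            ops.append(("PUSHDATA", script[pos:pos + byte]))
            pos += byte
        else:
            ops.append((OPCODES.get(byte, "UNKNOWN_%02x" % byte), None))
    return ops


def base58check(version, payload):
    """Encode version byte + payload + 4-byte double-SHA-256 checksum."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number, text = int.from_bytes(raw, "big"), ""
    while number:
        number, remainder = divmod(number, 58)
        text = ALPHABET[remainder] + text
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + text       # each leading zero byte becomes "1"


ops = disassemble(SCRIPT_PUB_KEY)
address = base58check(0, ops[2][1])
print([name for name, _ in ops])
print(address)
```

Comparing the printed address with the destination address quoted above is a quick consistency check on the hex.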

The final transaction

Once all the necessary methods are in place, the final transaction can be assembled. The final transaction is shown below. This combines the scriptSig and scriptPubKey above with the unsigned transaction described earlier.

version: 01 00 00 00
input count: 01
input:
  previous output hash (reversed): 48 4d 40 d4 5b 9e a0 d6 52 fc a8 25 8a b7 ca a4 25 41 eb 52 97 58 57 f9 6f b5 0c d7 32 c8 b4 81
  previous output index: 00 00 00 00
  script length: 8a
  scriptSig: 47 30 44 02 20 2c b2 65 bf 10 70 7b f4 93 46 c3 51 5d d3 d1 6f c4 54 61 8c 58 ec 0a 0f f4 48 a6 76 c5 4f f7 13 02 20 6c 66 24 d7 62 a1 fc ef 46 18 28 4e ad 8f 08 67 8a c0 5b 13 c8 42 35 f1 65 4e 6a d1 68 23 3e 82 01 41 04 14 e3 01 b2 32 8f 17 44 2c 0b 83 10 d7 87 bf 3d 8a 40 4c fb d0 70 4f 13 5b 6a d4 b2 d3 ee 75 13 10 f9 81 92 6e 53 a6 e8 c3 9b d7 d3 fe fd 57 6c 54 3c ce 49 3c ba c0 63 88 f2 65 1d 1a ac bf cd
  sequence: ff ff ff ff
output count: 01
output:
  value: 62 64 01 00 00 00 00 00
  script length: 19
  scriptPubKey: 76 a9 14 c8 e9 09 96 c7 c6 08 0e e0 62 84 60 0c 68 4e d9 04 d1 4c 5c 88 ac
block lock time: 00 00 00 00
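As a sanity check on the field layout, the following sketch parses the bytes above back into fields. `parse_tx` is a hypothetical helper for this illustration; it assumes every count and script length fits in a single byte, which holds for this transaction but not in general.

```python
import struct

RAW_TX = bytes.fromhex(
    "01 00 00 00 01"
    " 48 4d 40 d4 5b 9e a0 d6 52 fc a8 25 8a b7 ca a4"
    " 25 41 eb 52 97 58 57 f9 6f b5 0c d7 32 c8 b4 81"
    " 00 00 00 00 8a"
    " 47 30 44 02 20"
    " 2c b2 65 bf 10 70 7b f4 93 46 c3 51 5d d3 d1 6f"
    " c4 54 61 8c 58 ec 0a 0f f4 48 a6 76 c5 4f f7 13"
    " 02 20"
    " 6c 66 24 d7 62 a1 fc ef 46 18 28 4e ad 8f 08 67"
    " 8a c0 5b 13 c8 42 35 f1 65 4e 6a d1 68 23 3e 82"
    " 01 41 04"
    " 14 e3 01 b2 32 8f 17 44 2c 0b 83 10 d7 87 bf 3d"
    " 8a 40 4c fb d0 70 4f 13 5b 6a d4 b2 d3 ee 75 13"
    " 10 f9 81 92 6e 53 a6 e8 c3 9b d7 d3 fe fd 57 6c"
    " 54 3c ce 49 3c ba c0 63 88 f2 65 1d 1a ac bf cd"
    " ff ff ff ff 01"
    " 62 64 01 00 00 00 00 00 19"
    " 76 a9 14 c8 e9 09 96 c7 c6 08 0e e0 62 84 60 0c"
    " 68 4e d9 04 d1 4c 5c 88 ac"
    " 00 00 00 00"
)


def parse_tx(raw):
    """Parse a transaction whose counts and lengths are single bytes."""
    pos = 0

    def take(count):
        nonlocal pos
        chunk = raw[pos:pos + count]
        pos += count
        return chunk

    tx = {"version": struct.unpack("<I", take(4))[0], "inputs": [], "outputs": []}
    for _ in range(take(1)[0]):                       # input count
        tx["inputs"].append({
            "prev_hash": take(32)[::-1].hex(),        # stored reversed
            "prev_index": struct.unpack("<I", take(4))[0],
            "script": take(take(1)[0]),               # scriptSig
            "sequence": struct.unpack("<I", take(4))[0],
        })
    for _ in range(take(1)[0]):                       # output count
        tx["outputs"].append({
            "value": struct.unpack("<Q", take(8))[0], # in satoshis
            "script": take(take(1)[0]),               # scriptPubKey
        })
    tx["lock_time"] = struct.unpack("<I", take(4))[0]
    return tx


tx = parse_tx(RAW_TX)
print(tx["inputs"][0]["prev_hash"])
print(tx["outputs"][0]["value"])
```

The value field is little-endian, so 62 64 01 00 ... is 0x016462, or 91234 satoshis.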

A tangent: understanding elliptic curves

Bitcoin uses elliptic curves as part of the signing algorithm. I had heard about elliptic curves before in the context of solving Fermat's Last Theorem, so I was curious about what they are. The mathematics of elliptic curves is interesting, so I'll take a detour and give a quick overview.

The name elliptic curve is confusing: elliptic curves are not ellipses, do not look anything like ellipses, and they have very little to do with ellipses. An elliptic curve is a curve satisfying the fairly simple equation y^2 = x^3 + ax + b. Bitcoin uses a specific elliptic curve called secp256k1 with the simple equation y^2=x^3+7.[25]

Elliptic curve formula used by Bitcoin.

An important property of elliptic curves is that you can define addition of points on the curve with a simple rule: if you draw a straight line through the curve and it hits three points A, B, and C, then addition is defined by A+B+C=0. Due to the special nature of elliptic curves, addition defined in this way works "normally" and forms a group. With addition defined, you can define integer multiplication: e.g. 4A = A+A+A+A.

What makes elliptic curves useful cryptographically is that it's fast to do integer multiplication, but division basically requires brute force. For example, you can compute a product such as 12345678*A = Q really quickly (by computing powers of 2), but if you only know A and Q solving n*A = Q is hard. In elliptic curve cryptography, the secret number 12345678 would be the private key and the point Q on the curve would be the public key.

In cryptography, instead of using real-valued points on the curve, the coordinates are integers modulo a prime.[19] One of the surprising properties of elliptic curves is that the math works pretty much the same whether you use real numbers or modular arithmetic. Because of this, Bitcoin's elliptic curve doesn't look like the picture above, but is a random-looking mess of 256-bit points (imagine a big gray square of points).
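Point addition and fast integer multiplication can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the code used elsewhere in this article and not something to use for real keys. The constants are the standard published secp256k1 parameters (the prime from note [19], the base point G and its order N), and the function names are made up for this example. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# secp256k1: y^2 = x^3 + 7 over the integers modulo the prime P
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def on_curve(point):
    """None stands for the identity element (the 'point at infinity')."""
    if point is None:
        return True
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


def add(p1, p2):
    """Add two points: the third point on the line through them, reflected."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                                   # A + (-A) = 0
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P)  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P)
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def multiply(k, point):
    """Double-and-add: walk the bits of k, doubling the addend each step."""
    result, addend = None, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


Q = multiply(12345678, G)      # a "public key" for the toy private key
print(on_curve(Q))
```

`multiply` takes about 24 doublings for 12345678, which is the "powers of 2" trick; going from Q back to 12345678 has no comparable shortcut.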

The Elliptic Curve Digital Signature Algorithm (ECDSA) takes a message hash, and then does some straightforward elliptic curve arithmetic using the message, the private key, and a random number[18] to generate a new point on the curve that gives a signature. Anyone who has the public key, the message, and the signature can do some simple elliptic curve arithmetic to verify that the signature is valid. Thus, only the person with the private key can sign a message, but anyone with the public key can verify the message.

For more on elliptic curves, see the references[20].

Sending my transaction into the peer-to-peer network

Leaving elliptic curves behind, at this point I've created a transaction and signed it. The next step is to send it into the peer-to-peer network, where it will be picked up by miners and incorporated into a block.

How to find peers

The first step in using the peer-to-peer network is finding a peer. The list of peers changes every few seconds, whenever someone runs a client. Once a node is connected to a peer node, they share new peers by exchanging addr messages whenever a new peer is discovered. Thus, new peers rapidly spread through the system.

There's a chicken-and-egg problem, though, of how to find the first peer. Bitcoin clients solve this problem with several methods. Several reliable peers are registered in DNS under the name bitseed.xf2.org. By doing an nslookup, a client gets the IP addresses of these peers, and hopefully one of them will work. If that doesn't work, a seed list of peers is hardcoded into the client.[26]

nslookup can be used to find Bitcoin peers.
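The same lookup can be done from Python with the standard socket module. This is a minimal sketch: `find_peers` is a name made up for this example, the resolver is passed in so the function can be tried without network access, and there is no guarantee that the seed name mentioned above still resolves.

```python
import socket

BITCOIN_PORT = 8333


def find_peers(seed_host, resolve=socket.gethostbyname_ex):
    """Return (ip, port) candidates for a DNS seed, or [] if lookup fails."""
    try:
        _name, _aliases, addresses = resolve(seed_host)
    except OSError:                 # includes socket.gaierror
        return []
    return [(ip, BITCOIN_PORT) for ip in addresses]


# Offline demonstration with a stand-in resolver (documentation IP range).
def fake_resolve(host):
    return (host, [], ["192.0.2.1", "192.0.2.2"])


print(find_peers("seed.example", resolve=fake_resolve))
```

Calling `find_peers("bitseed.xf2.org")` without the `resolve` argument performs a real DNS query; an empty list means the lookup failed and another seed is needed.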

Peers enter and leave the network when ordinary users start and stop Bitcoin clients, so there is a lot of turnover in clients. The clients I use are unlikely to be operational right now, so you'll need to find new peers if you want to do experiments. You may need to try a bunch to find one that works.

Talking to peers

Once I had the address of a working peer, the next step was to send my transaction into the peer-to-peer network.[8] Using the peer-to-peer protocol is pretty straightforward. I opened a TCP connection to an arbitrary peer on port 8333, started sending messages, and received messages in turn. The Bitcoin peer-to-peer protocol is pretty forgiving; peers would keep communicating even if I totally messed up requests.

The protocol consists of about 24 different message types. Each message is a fairly straightforward binary blob containing an ASCII command name and a binary payload appropriate to the command. The protocol is well-documented on the Bitcoin wiki.

The first step when connecting to a peer is to establish the connection by exchanging version messages. First I send a version message with my protocol version number[21], address, and a few other things. The peer sends its version message back. After this, nodes are supposed to acknowledge the version message with a verack message. (As I mentioned, the protocol is forgiving - everything works fine even if I skip the verack.)

Generating the version message isn't totally trivial since it has a bunch of fields, but it can be created with a few lines of Python. makeMessage below builds an arbitrary peer-to-peer message from the magic number, command name, and payload. getVersionMessage creates the payload for a version message by packing together the various fields.
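A sketch of what such helpers can look like, following the message layout documented on the Bitcoin wiki: a 24-byte header (magic number, 12-byte null-padded command, payload length, and the first four bytes of the payload's double SHA-256) followed by the payload. The names `make_message`, `net_addr` and `version_payload` are chosen for this illustration. The empty user agent, zero start height, services value of 1 and placeholder IP addresses are assumptions to keep the example short.

```python
import hashlib
import random
import socket
import struct
import time

MAGIC = 0xD9B4BEF9          # main network magic number


def make_message(command, payload, magic=MAGIC):
    """Wrap a payload in the 24-byte peer-to-peer message header."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    header = struct.pack("<L12sL4s", magic, command.encode("ascii"),
                         len(payload), checksum)
    return header + payload


def net_addr(ip, port, services=1):
    """26-byte network address: services, IPv4-mapped IPv6 address, port."""
    return (struct.pack("<Q", services) + b"\x00" * 10 + b"\xff\xff"
            + socket.inet_aton(ip) + struct.pack(">H", port))


def version_payload(peer_ip, my_ip="127.0.0.1", port=8333, version=60002):
    """Pack the fields of a version message for protocol version 60002."""
    return (struct.pack("<iQq", version, 1, int(time.time()))
            + net_addr(peer_ip, port)            # address of the receiver
            + net_addr(my_ip, port)              # address of the sender
            + struct.pack("<Q", random.getrandbits(64))   # nonce
            + b"\x00"                            # empty user agent string
            + struct.pack("<i", 0))              # last block received


payload = version_payload("192.0.2.1")
message = make_message("version", payload)
verack = make_message("verack", b"")
print(len(payload), len(message), verack.hex())
```

The verack message has an empty payload, so it is just the 24-byte header.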

Sending a transaction: tx

I sent the transaction into the peer-to-peer network with the stripped-down Python script below. The script sends a version message, receives (and ignores) the peer's version and verack messages, and then sends the transaction as a tx message. The hex string is the transaction that I created earlier.
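The sequence of steps can be sketched as follows. This is not the original script: `send_tx` is a name made up for this illustration, the version message and transaction bytes are passed in by the caller, and the two fixed-size reads are a simplification that assumes the peer's replies each arrive in one read.

```python
import hashlib
import struct

MAGIC = 0xD9B4BEF9          # main network magic number


def make_message(command, payload, magic=MAGIC):
    """Wrap a payload in the 24-byte peer-to-peer message header."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return struct.pack("<L12sL4s", magic, command.encode("ascii"),
                       len(payload), checksum) + payload


def send_tx(sock, version_message, raw_tx):
    """Handshake just enough to be heard, then push the transaction."""
    sock.sendall(version_message)
    sock.recv(1000)                       # peer's version message (ignored)
    sock.recv(1000)                       # peer's verack message (ignored)
    sock.sendall(make_message("tx", raw_tx))


# Usage against a real peer (requires network access and a working peer):
#   import socket
#   sock = socket.create_connection((peer_ip, 8333))
#   send_tx(sock, version_message, bytes.fromhex(signed_transaction_hex))
```

Any object with `sendall` and `recv` methods works as `sock`, which makes the function easy to try against a stub before pointing it at the network.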

The following screenshot shows how sending my transaction appears in the Wireshark network analysis program[22]. I wrote Python scripts to process Bitcoin network traffic, but to keep things simple I'll just use Wireshark here. The "tx" message type is visible in the ASCII dump, followed on the next line by the start of my transaction (01 00 ...).

A transaction uploaded to Bitcoin, as seen in Wireshark.

To monitor the progress of my transaction, I had a socket opened to another random peer. Five seconds after sending my transaction, the other peer sent me a tx message with the hash of the transaction I just sent. Thus, it took just a few seconds for my transaction to get passed around the peer-to-peer network, or at least part of it.

Victory: my transaction is mined

After sending my transaction into the peer-to-peer network, I needed to wait for it to be mined before I could claim victory. Ten minutes later my script received an inv message with a new block (see Wireshark trace below). Checking this block showed that it contained my transaction, proving my transaction worked. I could also verify the success of this transaction by looking in my Bitcoin wallet and by checking online. Thus, after a lot of effort, I had successfully created a transaction manually and had it accepted by the system. (Needless to say, my first few transaction attempts weren't successful - my faulty transactions vanished into the network, never to be seen again.[8])

A new block in Bitcoin, as seen in Wireshark.

My transaction was mined by the large GHash.IO mining pool, into block #279068 with hash 0000000000000001a27b1d6eb8c405410398ece796e742da3b3e35363c2219ee. (The hash is reversed in the inv message above: ee19...) Note that the hash starts with a large number of zeros - finding such a value, literally one in a quintillion, is what makes mining so difficult. This particular block contains 462 transactions, of which my transaction is just one.
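The byte reversal is easy to check with a couple of lines of Python, using the block hash quoted above:

```python
BLOCK_HASH = "0000000000000001a27b1d6eb8c405410398ece796e742da3b3e35363c2219ee"

# Hashes are displayed big-endian but travel over the wire byte-reversed.
wire_order = bytes.fromhex(BLOCK_HASH)[::-1].hex()

# Count the leading zero hex digits of the displayed hash.
leading_zero_digits = len(BLOCK_HASH) - len(BLOCK_HASH.lstrip("0"))

print(wire_order)
print(leading_zero_digits)
```

The reversed form starts with ee19..., matching what shows up in the inv message, and the displayed hash has 15 leading zero hex digits.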

For mining this block, the miners received the reward of 25 bitcoins, and total fees of 0.104 bitcoins, approximately $19,000 and $80 respectively. I paid a fee of 0.0001 bitcoins, approximately 8 cents or 10% of my transaction. The mining process is very interesting, but I'll leave that for a future article.

Bitcoin mining normally uses special-purpose ASIC hardware, designed to compute hashes at high speed. Photo credit: Gastev, CC:by

Conclusion

Using the raw Bitcoin protocol turned out to be harder than I expected, but I learned a lot about bitcoins along the way, and I hope you did too. My full code is available on GitHub.[23] My code is purely for demonstration - if you actually want to use bitcoins through Python, use a real library[24] rather than my code.

Notes and references

[1] The original Bitcoin client is Bitcoin-qt. In case you're wondering why qt, the client uses the common Qt UI framework. Alternatively you can use wallet software that doesn't participate in the peer-to-peer network, such as Armory or MultiBit. Or you can use an online wallet such as Blockchain.info.

[2] A couple good articles on Bitcoin are How it works and the very thorough How the Bitcoin protocol actually works.

[3] The original Bitcoin paper is Bitcoin: A Peer-to-Peer Electronic Cash System written by the pseudonymous Satoshi Nakamoto in 2008. The true identity of Satoshi Nakamoto is unknown, although there are many theories.

[4] You may have noticed that sometimes Bitcoin is capitalized and sometimes not. It's not a problem with my shift key - the "official" style is to capitalize Bitcoin when referring to the system, and lower-case bitcoins when referring to the currency units.

[5] In case you're wondering how the popular MtGox Bitcoin exchange got its name, it was originally a trading card exchange called "Magic: The Gathering Online Exchange" and later took the acronym as its name.

[6] For more information on what data is in the blockchain, see the very helpful article Bitcoin, litecoin, dogecoin: How to explore the block chain.

[7] I'm not the only one who finds the Bitcoin transaction format inconvenient. For a rant on how messed up it is, see Criticisms of Bitcoin's raw txn format.

[8] You can also generate transactions and send raw transactions into the Bitcoin network using the bitcoin-qt console. Type sendrawtransaction a1b2c3d4.... This has the advantage of providing information in the debug log if the transaction is rejected. If you just want to experiment with the Bitcoin network, this is much, much easier than my manual approach.

[9] Apparently there's no solid reason to use RIPEMD-160 hashing to create the address and SHA-256 hashing elsewhere, beyond a vague sense that using a different hash algorithm helps security. See discussion. Using one round of SHA-256 is subject to a length extension attack, which explains why double-hashing is used.

[10] The Base58Check algorithm is documented on the Bitcoin wiki. It is similar to base 64 encoding, except it omits the O, 0, I, and l characters to avoid ambiguity in printed text. A 4-byte checksum guards against errors, since using an erroneous bitcoin address will cause the bitcoins to be lost forever.

[11] Some boilerplate has been removed from the code snippets. For the full Python code, see GitHub. You will also need the ecdsa cryptography library.

[12] You may wonder how I ended up with addresses with nonrandom prefixes such as 1MMMM. The answer is brute force - I ran the address generation script overnight and collected some good addresses. (These addresses made it much easier to recognize my transactions in my testing.) There are scripts and websites that will generate these "vanity" addresses for you.

[13] For a summary of Bitcoin fees, see bitcoinfees.com. This recent Reddit discussion of fees is also interesting.

[14] The original Bitcoin paper has a similar figure showing how transactions are chained together. I find it very confusing though, since it doesn't distinguish between the address and the public key.

[15] For details on the different types of contracts that can be set up with Bitcoin, see Contracts. One interesting type is the 2-of-3 escrow transaction, where two out of three parties must sign the transaction to release the bitcoins. Bitrated is one site that provides these.

[16] Although Bitcoin's Script language is very flexible, the Bitcoin network only permits a few standard transaction types and non-standard transactions are not propagated (details). Some miners will accept non-standard transactions directly, though.

[17] There isn't a security benefit from copying the scriptPubKey into the spending transaction before signing since the hash of the original transaction is included in the spending transaction. For discussion, see Why TxPrev.PkScript is inserted into TxCopy during signature check?

[18] The random number used in the elliptic curve signature algorithm is critical to the security of signing. Sony used a constant instead of a random number in the PlayStation 3, allowing the private key to be determined. In an incident related to Bitcoin, a weakness in the random number generator allowed bitcoins to be stolen from Android clients.

[19] For Bitcoin, the coordinates on the elliptic curve are integers modulo the prime 2^256 - 2^32 - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1, which is very nearly 2^256. This is why the keys in Bitcoin are 256-bit keys.

[20] For information on the historical connection between elliptic curves and ellipses (the equation turns up when integrating to compute the arc length of an ellipse) see the interesting article Why Ellipses Are Not Elliptic Curves, Adrian Rice and Ezra Brown, Mathematics Magazine, vol. 85, 2012, pp. 163-176. For more introductory information on elliptic curve cryptography, see ECC tutorial or A (Relatively Easy To Understand) Primer on Elliptic Curve Cryptography. For more on the mathematics of elliptic curves, see An Introduction to the Theory of Elliptic Curves by Joseph H. Silverman. Three Fermat trails to elliptic curves includes a discussion of how Fermat's Last Theorem was solved with elliptic curves.

[21] There doesn't seem to be documentation on the different Bitcoin protocol versions other than the code. I'm using version 60002 somewhat arbitrarily.

[22] The Wireshark network analysis software can dump out most types of Bitcoin packets, but only if you download a recent "beta" release - I'm using version 1.11.2.

[23] The full code for my examples is available on GitHub.

[24] Several Bitcoin libraries in Python are bitcoin-python, pycoin, and python-bitcoinlib.

[25] The elliptic curve plot was generated from the Sage mathematics package:

var("x y")
implicit_plot(y^2-x^3-7, (x,-10, 10), (y,-10, 10), figsize=3, title="y^2=x^3+7")

[26] The hardcoded peer list in the Bitcoin client is in chainparams.cpp in the array pnseed. For more information on finding Bitcoin peers, see How Bitcoin clients find each other or Satoshi client node discovery.
