Channel: Hacker News 50

Philip Guo - Silent Technical Privilege


Comments:"Silent Technical Privilege"

URL:http://pgbovine.net/tech-privilege.htm


January 2014

When I first read On Technical Entitlement by Tess Rinearson in mid-2012, it resonated with me so much that I emailed her. I've been meaning to expand that original email into an article for a while now, so here goes ...

This was me at nine years old (with horrible posture):

I started programming when I was five, first with Logo and then BASIC. By the time this photo was taken, I had already written several BASIC games that I distributed as shareware on our local BBS. I was fast growing bored, so my parents (both software engineers) gave me the original dragon compiler textbook from their grad school days. That's when I started learning C and writing my own simple interpreters and compilers. My early interpreters were for BASIC, but by the time I entered high school, I had already created a self-hosting compiler for a non-trivial subset of C (no preprocessor, though). Throughout most of high school, I spent weekends coding in x86 assembly, obsessed with hand-tuning code for the newly-released Pentium II chips. When I started my freshman year at MIT as a Computer Science major, I already had over ten years of programming experience. So I felt right at home there.

Okay that entire paragraph was a lie. Did you believe me? If so, why? Was it because I looked like a kid programming whiz?

When that photo was taken, I didn't even know how to touch-type. My parents were just like, “Quick, pose in front of our new computer!” (Look closely. My fingers aren't even in the right position.) My parents were both humanities majors, and there wasn't a single programming book in my house. In 6th grade, I tried teaching myself BASIC for a few weeks but quit because it was too hard. The only real exposure I had to programming prior to college was taking AP Computer Science in 11th grade, taught by a math teacher who had learned the material only a month before class started. Despite its shortcomings, that class inspired me to major in Computer Science in college. But when I started freshman year at MIT, I felt a bit anxious because many of my classmates actually did have over ten years of childhood programming experience; I had less than one.

Silent Technical Privilege

Even though I didn't grow up in a tech-savvy household and couldn't code my way out of a paper bag, I had one big thing going for me: I looked like I was good at programming. Here's me during freshman year of college:

As an Asian male student at MIT, I fit society's image of a young programmer. Thus, throughout college, nobody ever said to me:

  • “Well, you only got into MIT because you're an Asian boy.”
  • (while struggling with a problem set) “Well, not everyone is cut out for Computer Science; have you considered majoring in bio?”
  • (after being assigned to a class project team) “How about you just design the graphics while we handle the backend? It'll be easier for everyone that way.”
  • “Are you sure you know how to do this?”

Although I started off as a complete novice (like everyone once was), I never faced any micro-inequities to impede my intellectual growth. Throughout college and grad school, I gradually learned more and more via classes, research, and internships, incrementally taking on harder and harder projects, and getting better and better at programming while falling deeper and deeper in love with it. Instead of doing my ten years of deliberate practice from ages 8 to 18, I did mine from ages 18 to 28. And nobody ever got in the way of my learning – not even inadvertently – because I looked like the sort of person who would be good at such things.

Instead of facing implicit bias or stereotype threat, I had the privilege of implicit endorsement. For instance, whenever I attended technical meetings, people would assume that I knew what I was doing (regardless of whether I did or not) and treat me accordingly. If I stared at someone in silence and nodded as they were talking, they would usually assume that I understood, not that I was clueless. Nobody ever talked down to me, and I always got the benefit of the doubt in technical settings.

As a result, I was able to fake it till I made it, often landing jobs whose postings required skills I hadn't yet learned but knew that I could pick up on the spot. Most of my interviews for research assistantships and summer internships were quite casual – I looked and sounded like I knew what I was doing, so people just gave me the chance to try. And after enough rounds of practice, I actually did start knowing what I was doing. As I gained experience, I was able to land more meaningful programming jobs, which led to a virtuous cycle of further improvement.

This kind of privilege that I – and other people who looked like me – possessed was silent, manifested not in what people said, but rather in what they didn't say. We had the privilege to spend enormous amounts of time developing technical expertise without anyone's interference or implicit discouragement. Sure, we worked really hard, but our efforts directly translated into skill improvements without much loss due to interpersonal friction. Because we looked the part.

The other side

In contrast, ask any Computer Science major who isn't from a majority demographic (i.e., white or Asian male), and I guarantee that they've encountered discouraging quotes like “You know, not everyone is cut out for Computer Science ...” They probably still remember the words and actions that have hurt the most, even though those making the remarks often aren't trying to harm.

For example, one of my good friends took the Intro to Java course during freshman year and enjoyed it. She wanted to get better at Java GUI programming, so she got a summer research assistantship at the MIT Media Lab. However, instead of letting her build the GUI (like the job ad described), the supervisor instead assigned her the mind-numbing task of hand-transcribing audio clips all summer long. He assigned a new male student to build the GUI application. And it wasn't like that student was some child programming prodigy – he was also a freshman with the same amount of (limited) experience as she had! That other student spent the summer getting better at GUI programming while she just ground away mindlessly transcribing audio. As a result, she grew resentful and shied away from learning more CS.

Thinking about this story always angers me: Here was someone with a natural interest who took the initiative to learn more and was denied the opportunity to do so. I have no doubt that she could have gotten good at programming – and really enjoyed it! – if she had the same opportunities as I did. That spark was there in her during freshman year but was snuffed out by one bad initial experience.

(Also, when she first got into MIT, her aunt – whose son had been rejected – congratulated her by saying, “Well, you only got into MIT because you're a girl.”)

Over a decade later, she now does some programming at her research job but wishes that she had learned more back in college. However, she has such a negative association with everything CS-related that it is hard for her to motivate herself to seek further learning opportunities for fear of being shot down again.

Programmers aren't superheroes

One trite retort is, “Well your friend should've been tougher and not given up so easily. If she wanted it badly enough, she should've tried again, even knowing that she might face resistance.”

These sorts of remarks also aggravate me. Writing code for a living isn't like being in the Navy SEALs hunting down international terrorists or simultaneously shooting three pirates in the head at sea while they were pointing a gun at a civilian. Programming is seriously not that demanding, so you shouldn't need to be a tough-as-nails superhero to enter this profession.

Just look at this photo of me from a software engineering summer internship:

Even though I was hacking on a hardware simulator in C++, which sounds mildly hardcore, I was actually pretty squishy, chillin' in my cubicle and often taking extended lunch breaks. All of the guys around me (yes, the programmers were all men, with the exception of one older woman who didn't hang out with us) were also fairly squishy. These guys made a fine living and were good at what they did; but seriously, they weren't superheroes. The most hardship that one of the guys faced all summer was staying up late playing the game Doom 3 when it first came out and then rolling into the office dead-tired the next morning. Anyone with enough practice and motivation could have done this job, and most other programming and CS-related jobs as well. Seriously, companies aren't looking to hire the next Steve Wozniak – they just want to ship code that works.

It frustrates me that people not in the majority demographic often need to be tough as nails to succeed in this field, constantly bearing the lasting effects of thousands of micro-inequities. One researcher notes that:

[...] micro-inequities often had serious cumulative, harmful effects, resulting in hostile work environments and continued minority discrimination in public and private workplaces and organizations. What makes micro-inequities particularly problematic is that they consist in micro-messages that are hard to recognize for victims, bystanders and perpetrators alike. When victims of micro-inequities do recognize the micro-messages, Rowe argues, it is exceedingly hard to explain to others why these small behaviors can be a huge problem.

In contrast, people who look like me can just kinda do programming for work if we want, or not do it, or switch into it later, or out of it again, or work quietly, or nerd-rant on how Ruby sucks or rocks or whatever, or name-drop monads. And nobody will make remarks about our appearance, about whether we're truly dedicated hackers, or how our behavior might reflect badly on “our kind” of people. That's silent technical privilege.

Conclusion

Here's a thought experiment: For every white or Asian male expert programmer you know, imagine a parallel universe where they were of another ethnicity and/or gender but had the exact same initial interest and aptitude levels. Would they still have been willing to devote the over ten thousand hours of deliberate practice to achieve mastery in the face of dozens or hundreds of instances of implicit discouragement they will inevitably encounter over the years? Sure, some super-resilient outliers would, but many wouldn't. Many of us would quit, even though we had the potential and interest to thrive in this field.

I hope to live in a future where people who already have the interest to pursue CS or programming don't self-select themselves out of the field. I want those people to experience what I was privileged enough to have gotten in college and beyond – unimpeded opportunities to develop expertise in something that they find beautiful, practical, and fulfilling.

The bigger goal on this front is to spur interest in young people from underrepresented demographics who might never otherwise think to pursue CS or STEM studies in general. There are great people and organizations working toward this goal on multiple fronts. Although I think that increased and broader participation is critical, a more immediate concern is reducing attrition of those already in the field. For instance, a 2012 STEM education report to the President

[...] found that economic forecasts point to a need for producing, over the next decade, approximately 1 million more college graduates in STEM fields than expected under current assumptions. Fewer than 40% of students who enter college intending to major in a STEM field complete a STEM degree. Merely increasing the retention of STEM majors from 40% to 50% would generate three quarters of the targeted 1 million additional STEM degrees over the next decade.
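The arithmetic behind that last claim is easy to check. A minimal Python sketch (assuming, as the report's wording suggests, that the figures describe a single decade's cohort of students entering college intending a STEM major):

target = 1_000_000                 # additional STEM degrees needed per decade
gain = 0.50 - 0.40                 # retention improvement from 40% to 50%
claimed = 0.75 * target            # "three quarters of the targeted 1 million"
cohort = claimed / gain            # implied cohort of intended STEM majors
print(f"implied cohort: {cohort:,.0f}")        # 7,500,000 students
print(f"extra degrees: {gain * cohort:,.0f}")  # 750,000 = 75% of the target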

That's why I plan to start by taking steps to encourage and retain those who already want to learn.

Created: 2014-01-05
Last modified: 2014-01-05



Lexical Distance Among the Languages of Europe « Etymologikon™


Comments:"Lexical Distance Among the Languages of Europe « Etymologikon™"

URL:http://elms.wordpress.com/2008/03/04/lexical-distance-among-languages-of-europe/


Lexical Distance Among the Languages of Europe

Posted by Teresa Elms on 4 March 2008

 

This chart shows the lexical distance — that is, the degree of overall vocabulary divergence — among the major languages of Europe.

The size of each circle represents the number of speakers for that language. Circles of the same color belong to the same language group. All the groups except for Finno-Ugric (in yellow) are in turn members of the Indo-European language family.

English is a member of the Germanic group (blue) within the Indo-European family. But thanks to 1066, William of Normandy, and all that, about 75% of the modern English vocabulary comes from French and Latin (i.e. the Romance languages, in orange) rather than Germanic sources. As a result, English (a Germanic language) and French (a Romance language) are actually closer to each other in lexical terms than Romanian (a Romance language) and French are.

So why is English still considered a Germanic language? Two reasons. First, the most frequently used 80% of English words come from Germanic sources, not Latinate sources. Those famous Anglo-Saxon monosyllables live on! Second, the syntax of English, although much simplified from its Old English origins, remains recognizably Germanic. The Norman conquest added French vocabulary to the language, and through pidginization it arguably stripped out some Germanic grammar, but it did not ADD French grammar.

The original research data for the chart comes from K. Tyshchenko (1999), Metatheory of Linguistics. (Published in Russian.)
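Tyshchenko's exact method isn't described here, but as a toy illustration of what a lexical-distance measure might compute, here is a sketch of my own (not from the research): it averages normalized edit distance over a tiny aligned word list, whereas real studies use large, curated cognate lists.

def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def lexical_distance(words_a, words_b):
    # Average normalized edit distance over aligned word pairs.
    pairs = list(zip(words_a, words_b))
    return sum(edit_distance(a, b) / max(len(a), len(b)) for a, b in pairs) / len(pairs)

english = ["water", "night", "mother", "three", "new"]
german = ["wasser", "nacht", "mutter", "drei", "neu"]
french = ["eau", "nuit", "mere", "trois", "nouveau"]
print(f"en-de: {lexical_distance(english, german):.2f}")  # smaller = closer
print(f"en-fr: {lexical_distance(english, french):.2f}")

On this toy core-vocabulary list English lands closer to German, which matches the point below about the most frequently used English words being Germanic.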


Why Everyone Seems to Have Cancer - NYTimes.com


Comments:"Why Everyone Seems to Have Cancer - NYTimes.com"

URL:http://www.nytimes.com/2014/01/05/sunday-review/why-everyone-seems-to-have-cancer.html?smid=tw-share&pagewanted=all


Half a century ago, the story goes, a person was far more likely to die from heart disease. Now cancer is on the verge of overtaking it as the No. 1 cause of death.

Troubling as this sounds, the comparison is unfair. Cancer is, by far, the harder problem — a condition deeply ingrained in the nature of evolution and multicellular life. Given that obstacle, cancer researchers are fighting and even winning smaller battles: reducing the death toll from childhood cancers and preventing — and sometimes curing — cancers that strike people in their prime. But when it comes to diseases of the elderly, there can be no decisive victory. This is, in the end, a zero-sum game.

The rhetoric about the war on cancer implies that with enough money and determination, science might reduce cancer mortality as dramatically as it has with other leading killers — one more notch in medicine’s belt. But what, then, would we die from? Heart disease and cancer are primarily diseases of aging. Fewer people succumbing to one means more people living long enough to die from the other.

The newest cancer report, which came out in mid-December, put the best possible face on things. If one accounts for the advancing age of the population — with the graying of the baby boomers, death itself is on the rise — cancer mortality has actually been decreasing bit by bit in recent decades. But the decline has been modest compared with other threats.

A graph from the Centers for Disease Control and Prevention tells the story. There are two lines representing the age-adjusted mortality rate from heart disease and from cancer. In 1958, when the diagram begins, the line for heart disease is decisively on top. But it plunges by 68 percent while cancer declines so slowly — by only about 10 percent — that the slope appears far less significant.

Measuring from 1990, when tobacco had finished the worst of its damage and cancer deaths were peaking, the difference is somewhat less pronounced: a decline of 44 percent for heart disease and 20 percent for cancer. But as the collision course continues, cancer seems insistent on becoming the one left standing — death’s final resort. (The wild card in the equation is death from complications of Alzheimer’s disease, which has been advancing year after year.)

Though not exactly consoling, the fact that we have reached this standoff is a kind of success. A century ago average life expectancy at birth was in the low to mid-50s. Now it is almost 79, and if you make it to 65 you’re likely to live into your mid-80s. The median age of cancer death is 72. We live long enough for it to get us.

The diseases that once killed earlier in life — bubonic plague, smallpox, influenza, tuberculosis — were easier obstacles. For each there was a single infectious agent, a precise cause that could be confronted. Even AIDS is being managed more and more as a chronic condition.

Progress against heart disease has been slower. But the toll has been steadily reduced, or pushed further into the future, with diet, exercise and medicines that help control blood pressure and cholesterol. When difficulties do arise they can often be treated as mechanical problems — clogged piping, worn-out valves — for which there may be a temporary fix.

Because of these interventions, people between 55 and 84 are increasingly more likely to die from cancer than from heart disease. For those who live beyond that age, the tables reverse, with heart disease gaining the upper hand. But year by year, as more failing hearts can be repaired or replaced, cancer has been slowly closing the gap.

For the oldest among us, the two killers are fighting to a draw. But there are reasons to believe that cancer will remain the most resistant. It is not so much a disease as a phenomenon, the result of a basic evolutionary compromise. As a body lives and grows, its cells are constantly dividing, copying their DNA — this vast genetic library — and bequeathing it to the daughter cells. They in turn pass it to their own progeny: copies of copies of copies. Along the way, errors inevitably occur. Some are caused by carcinogens but most are random misprints.

Over the eons, cells have developed complex mechanisms that identify and correct many of the glitches. But the process is not perfect, nor can it ever be. Mutations are the engine of evolution. Without them we never would have evolved. The trade-off is that every so often a certain combination will give an individual cell too much power. It begins to evolve independently of the rest of the body. Like a new species thriving in an ecosystem, it grows into a cancerous tumor. For that there can be no easy fix.

These microscopic rebellions have been happening for at least half a billion years, since the advent of complex multicellular life — collectives of cells that must work together, holding back, as best each can, the natural tendency to proliferate. Those that do not — the cancer cells — are doing, in a Darwinian sense, what they are supposed to do: mutating, evolving and increasing in fitness compared with their neighbors, the better behaved cells of the body. And these are left at a competitive disadvantage, shackled by a compulsion to obey the rules.

As people age their cells amass more potentially cancerous mutations. Given a long enough life, cancer will eventually kill you — unless you die first of something else. That would be true even in a world free from carcinogens and equipped with the most powerful medical technology.

Faced with this inevitability, there have been encouraging reductions in the death toll from childhood cancer, with mortality falling by more than half since 1975. For older people, some early-stage cancers — those that have not learned to colonize other parts of the body — can be cured with a combination of chemicals, radiation therapy and surgery. Others can be held in check for years, sometimes indefinitely. But the most virulent cancers have evolved such wily subterfuges (a survival instinct of their own) that they usually prevail. Progress is often measured in a few extra months of life.

Over all, the most encouraging gains are coming from prevention. Worldwide, some 15 to 20 percent of cancers are believed to be caused by infectious agents. With improvements in refrigeration and public sanitation, stomach cancer, which is linked to Helicobacter pylori bacteria, has been significantly reduced, especially in more developed parts of the world. Vaccines against human papilloma virus have the potential of nearly eliminating cervical cancer.

Where antismoking campaigns are successful, lung cancer, which has accounted for almost 30 percent of cancer deaths in the United States, is steadily diminishing. More progress can be made with improvements in screening and by reducing the incidence of obesity, a metabolic imbalance that, along with diabetes, gives cancer an edge.

Surprisingly, only a small percentage of cancers have been traced to the thousands of synthetic chemicals that industry has added to the environment. As regulations are further tightened, cancer rates are being reduced a little more.

Most of the progress has been in richer countries. With enough political will the effort can be taken to poorer parts of the world. In the United States, racial disparities in cancer rates must be addressed. But there is a long way to go. For most cancers the only identifiable cause is entropy, the random genetic mutations that are an inevitable part of multicellular life.

Advances in the science will continue. For some cancers, new immune system therapies that bolster the body’s own defenses have shown glints of promise. Genomic scans determining a cancer’s precise genetic signature, nano robots that repair and reverse cellular damage — there are always new possibilities to explore.

Maybe someday some of us will live to be 200. But barring an elixir for immortality, a body will come to a point where it has outwitted every peril life has thrown at it. And for each added year, more mutations will have accumulated. If the heart holds out, then waiting at the end will be cancer.

George Johnson is a former reporter and editor at The New York Times and the author of “The Cancer Chronicles.”

Thousands of visitors to yahoo.com hit with malware attack, researchers say


Comments:"Thousands of visitors to yahoo.com hit with malware attack, researchers say"

URL:http://www.washingtonpost.com/blogs/the-switch/wp/2014/01/04/thousands-of-visitors-to-yahoo-com-hit-with-malware-attack-researchers-say/?tid=hpModule_88854bf0-8691-11e2-9d71-f0feafdd1394


The Yahoo logo is shown at the company's headquarters in Sunnyvale, Calif. in this April 16, 2013 file photo. (Robert Galbraith/Reuters)

Two Internet security firms have reported that Yahoo's advertising servers have been distributing malware to hundreds of thousands of users over the last few days. The attack appears to be the work of malicious parties who have hijacked Yahoo's advertising network for their own ends.

Fox IT, a security firm based in the Netherlands, wrote a blog post on Friday describing the problem. "Clients visiting yahoo.com received advertisements served by ads.yahoo.com. Some of the advertisements are malicious," the firm reported. Instead of serving ordinary ads, Yahoo's servers reportedly send users an "exploit kit" that "exploits vulnerabilities in Java and installs a host of different malware."

Ashkan Soltani, a security researcher and Washington Post contributor, alerted me to the issue. Often, he says, such attacks are "the result of hacking an existing ad network." But there's another possibility, he says. The culprits may have simply submitted the malicious software as ordinary ads, sneaking past Yahoo's system for filtering out malicious submissions.

Fox IT says Yahoo users have been getting infected since at least Dec. 30. At the time it discovered the issue on Friday, the firm says, malicious payloads were being delivered to around 300,000 users per hour. The company guesses that around 9 percent of those, or 27,000 users per hour, were being infected. More recently, the firm says, the volume of infections has tapered off, perhaps due to efforts by Yahoo's security team.

"It is unclear which specific group is behind this attack, but the attackers are clearly financially motivated," the firm writes. Fox IT suggests that whoever is behind the attack may be selling control over the victims' computers to other online criminals.

Another security researcher based in the Netherlands, Mark Loman, has confirmed seeing the malware. His firm, Surfright, makes anti-virus software.

The fact that the malware targeted flaws in the Java programming environment is an important reminder that the software has become a security menace. When it was created almost two decades ago, the Java programming language was hailed as a way to make Web sites more interactive. But it has been largely superseded for this purpose by technologies like Flash and JavaScript.

As Java's Web plugin has declined in popularity among legitimate Web developers, its security flaws have become a juicy target for hackers. Some browser vendors are moving toward blocking the technology outright. And security experts recommend that if your browser supports it, you should disable Java (but not JavaScript, a completely separate technology) as a precaution.

Update: "At Yahoo, we take the safety and privacy of our users seriously," a Yahoo spokeswoman said in a Saturday email to the Washington Post. "We recently identified an ad designed to spread malware to some of our users. We immediately removed it and will continue to monitor and block any ads being used for this activity."

My Nerd Story | crystal beasley


Comments:"My Nerd Story | crystal beasley"

URL:http://skinnywhitegirl.com/blog/my-nerd-story/1101/


Hi, Paul Graham. My name is Crystal and I’ve been hacking for the past 29 years. I don’t know how you intended your comments but oh lordy, the internet has had a fun time speculating. I’ll leave that commentary to others but I do want to say I’m happy you brought up this debate about gender disparity in tech, as it has sparked some excellent conversation about how we’re going to fix it.

I believe we will fix it in part by having role models. In that spirit, I offer my own nerd story.

I was a poor kid from Arkansas eating government cheese, raised by my grandmother. I didn’t know I was poor. For Christmas 1984, I asked for a computer. Santa brought me a navy and yellow toy computer called a Whiz Kid. I was disappointed. I meant a real computer.

When I was in middle school, I went to a nerd summer camp called "Artificial Intelligence" in 1991 at Harding University. It was my first taste of programming. The language? LISP. Their comp sci department had VAX mainframes where we tinkered with Pine email, sent the other campers chat messages and used telnet to play text adventure games. My username was Cleopatra on Medievia.

When I was twelve, I saved up $500 and bought an 8088 computer. Remember the 386/486 and the Pentium? This was the one that came before them. I had enough money to spring for the optional 20MB hard drive, a wise investment. I needed that to save all the images I was about to create with Paint and to play Wolfenstein 3D, Where in the World is Carmen Sandiego and QBasic Gorillas. It was on that computer I learned the universal hardware repair rule that you don't put the case on and screw in all the tiny screws until you're absolutely sure the new power supply is working.

At fourteen I was programming my TI-85 to graph equations and check my algebra solutions. It’s not clear if it was cheating. I suspect the teachers knew I was going to make the highest grade on the math test regardless.

At seventeen, I had a perfect score, 36, on the science part of the ACT and a 33 in math which qualified me for a generous scholarship. Arkansas doesn’t have an early graduation provision, so I dropped out of high school, took my GED and enrolled in university a whole year early. I majored in computer science. I later added a major in fine art and moved comp sci to be my minor. The natural intersection of design and programming seemed to be web design, which is the field I’m still in.

This career has given me enormous opportunities. I’m typing this from Phnom Penh, Cambodia where I’ve come to volunteer my professional skills to help good causes in developing countries. I travel around the world pushing the web forward as a Product Designer at Mozilla. Before that I was a LOLcat herder for I Can Has Cheezburger. I’ve spoken at conferences in Barcelona, New York and of course, my home in Portland, Oregon. I love the flexibility and creativity of my career, and I’m incredibly fortunate and grateful to live this life.

So what now? If you're a man, please share this on the social media platform of your choice. Women are half as likely to be retweeted as men. Want to do more? Ask a woman you admire to tell her story. If you're a woman, write up your nerd origins and share it with the hashtag #mynerdstory. The 13-year-olds of today need role models from every racial, ethnic and socioeconomic background. Adult women need role models too, like Melinda Byerley who learned HTML and CSS at 42 so she could hack on her startup's website. We need to hear your story, too.

Where the best designers go to find photos and graphics


Comments:"Where the best designers go to find photos and graphics "

URL:http://www.sitebuilderreport.com/websites-101/design-guides/where-the-best-designers-go-to-find-photos-and-graphics


Websites 101: Where the best designers go to find photos and graphics

Beautiful websites aren’t made, they’re found. Smart designers know where to find perfect photos and graphics.

I’ll let you in on a little secret: beautiful websites aren’t made, they’re found. Smart designers know where to find that perfect photo, subtle pattern or that unique icon.

Here’s where the best designers go to find photos, graphics, icons, and more.

Stunning Photography

Photography is what separates a good website from a great website. There may be no better example of great photography than Apple’s website. Apple loves to showcase products with huge, eye popping photography. It’s a great example of how to use photos.

Here’s where the best designers find photos:

New Old Stock Photos – Awesome vintage photos from public archives. Completely free of known copyright restrictions.

Sample photo from New Old Stock Photos

Super Famous – Lots of geological, biological and aerial photography from Dutch interaction designer Folkert Gorter. Sound boring? It's not. These photos are one of a kind.

Unsplash – A free email newsletter stuffed with hi-resolution photos. Sent every ten days.

Little Visuals – Another free email newsletter. Sends weekly batches of 7 gorgeous photos.

Comp Fight – The fastest way to find Creative Commons images to use on your blog or website.

Pic Jumbo – Big, searchable database of totally free to use photos.

Sample photo from Pic Jumbo

Sharp Icons

The next time you’re browsing Facebook, Google or Twitter, take note of how many icons you’ll see. They’re everywhere. That’s because icons are an essential part of web design.

Here’s where smart graphic designers go to find icons:

The Noun Project – The granddaddy. Enormous database of over 25,000 icons, growing daily. Each icon has a similar format so they always look professional and consistent.

Icons from The Noun Project

Icon Monstr – Free, simple icons discoverable through a search interface. Over 2,000 icons.

Icon Sweets – Love iOS7? This downloadable library has over 1,000 icons in the style of iOS7 (iOS7 is known for having outlined icons). Available for only $10.

Graphics & Logos

Creative Market – Amazing, handcrafted graphics from designers around the world. I always find top-rate stuff here.

Creative Market

99 Designs – Crowdsource your logo: hold a contest and have 20 – 30 designers submit logo entries. Choose the one you like.

Scoop Shoot – Need photos from around town? ScoopShoot lets you hire people around town to take photos with their phones.

Subtle Patterns – Over 369 subtle (duh) patterns. They all work by being repeated, so they're perfect for websites.

Subtle Patterns

Colour Lovers– Have difficulty choosing which colours to use? Colour Lovers has literally millions of colour palettes created by users that are ready for you to use.

Two tools that will make you a design rockstar

Pixlr – Pixlr is like a free, online version of Photoshop.

Placeit – An incredible website. Just drag and drop a screenshot and it will generate shots of your screenshot in realistic environments.

Placeit – Just upload a screenshot and Placeit will automatically place it in a new context.

Bonus: Reader submitted resources

Savvy readers have mentioned additional tools that I hadn’t included. Here’s some that have stuck out to me:

Gimp – A bit of a learning curve, but a free tool for image editing (Thanks to Jesus Bejarano and Rekasays in the comments section).

Creative Commons images on Flickr – Thanks to 3stripe on Hacker News.

Enjoyed this guide? We'd love if you shared it on Twitter or Facebook!


Evidence my ISP is tracking their customers and selling the data.


Comments:"Evidence my ISP is tracking their customers and selling the data."

URL:http://haydenjameslee.com/evidence-my-isp-may-be-making-money-from-tracking-its-customers/


Evidence my ISP may be making money from tracking its customers

Jan 5, 2014

 

tl;dr: My ISP, Access Media 3, has started injecting tracking cookies into HTML packets going through its network and is potentially making money from tracking its customers.

—————–

About a week ago I started noticing something strange. I was viewing YouTube and saw a white bar (pictured below) at the top of the page. I didn’t think much of it until I visited StackOverflow and saw the exact same white bar.

I opened the Chrome dev tools and found a few javascript errors:

Upon further inspection it turns out this ‘random script’ had been injected by a <script> tag in the header. I looked at some other sites and noticed the same script being inserted almost everywhere. Here is what it looks like:

<script type="text/javascript"> var dot='.'; var setCookie='net'; var gAnalytic='adsvc1107131'; var IETest='rxg'; var v='ashx'; var R='ajs'; var gid='5d738f4aeccb49c39d3013cabc563f64'; </script>
<script type="text/javascript" src="http://rxg.adsvc1107131.net/ajs.ashx?t=1&amp;5d738f4aeccb49c39d3013cabc563f64" id="js-1006893410" data-loaded="true"></script>

I realized that the only sites that weren't affected were those using HTTPS rather than HTTP. This makes sense: you can't inject code into HTTPS traffic because it is encrypted.

The effect of this script was to add an iframe to YouTube and StackOverflow; however, other pages (including ones I've built myself) had no injected iframes, only the script tags in the <head>. My theory is that this is related to sites that serve ads, but I have not confirmed this.
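As a quick sanity check for this kind of tampering, here is a small Python sketch of my own (not a tool from this post): it fetches a page over plain HTTP and greps the response for script tags matching the pattern injected above. The rxg.adsvc*.net pattern comes from the snippet I captured; adjust it for whatever your network injects.

import re
import requests

# Pattern based on the injected tag shown above.
INJECT_RE = re.compile(r'<script[^>]*src="http://rxg\.adsvc\w+\.net/[^"]*"', re.I)

def check(url):
    html = requests.get(url, timeout=10).text
    hits = INJECT_RE.findall(html)
    if hits:
        print(url, "- possible injected script(s):")
        for hit in hits:
            print("   ", hit)
    else:
        print(url, "- no injected script found")

for url in ["http://stackoverflow.com/", "http://www.youtube.com/"]:
    check(url)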

Here is a gist of the iframe that was being injected into YouTube.

tl;dr: The iframe is coming from Ad-vantage Networks.

I did a whois on some of the domains where these scripts are being hosted and they pointed to Ad-vantage Networks also. Or they were pretty obvious urls like: advn.net. I followed some of the urls around and found an interesting open folder which stores a bunch of the javascript that Ad-vantage uses:

http://adsmws.advn.net/

I poked around on Google and found that Ad-vantage Networks is now known as MediaShift.

So who is injecting the code?

My initial thoughts were that it was just a simple Chrome extension. So I checked the site on Firefox and my Nexus… same result. I plugged in my ethernet cord to rule out my wireless router… same result. Same white bar at the top of YouTube. I switched my Nexus over to 3G and voila! The white bar disappeared. Something in between my wall and YouTube was injecting this code.

I ran mtr to see if there were any suspicious hops that my packets were routing through and this was the result:

Nothing out of the ordinary, at least to my untrained eye (I’m by no means a networks expert).

Plot twist time

Around the same time that I started seeing this injected code, I was building a Node.js website and noticed a weird change in behavior. Usually, when my node server was off and I accidentally hit its URL, I received the standard Chrome “This webpage is not available” page. With no change on my part, I started seeing different error pages, as shown below.

From:

To:

At the time I didn't think much of it at all. Now I believe it holds the vital clue in this whole situation. But before figuring that out, I did some more research into what MediaShift does. Here's a slide from their front page that was particularly interesting. Internet network providers, you say? I dug further into their site and found their list of partners.

The kicker

After looking through these providers for more info I found the final piece of the puzzle. RGNets.com's main product is the rXg box. Look back to that new error page I was seeing. Here is the fine print:

Generated Sat, 04 Jan 2014 23:52:15 GMT by va-bbg-core-rxg2.am3wireless.com (squid/3.3.3)

Notice three interesting points:

  • The machine seems to be an rXg, made by RGNets.com
  • It's owned by AM3, Access Media 3, my ISP
  • It is a Squid server

Some research into Squid servers shows this capability – most interestingly, the ability to “Add, remove, or modify an HTTP header field (e.g., Cookie)”, which is exactly the injection I was seeing.
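To make concrete what an in-path box can do to unencrypted traffic, here is a toy Python proxy of my own (emphatically not AM3's actual setup): it fetches the requested URL and splices a script tag into any HTML response before handing it back. The tracker.example URL is hypothetical.

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

INJECT = b'<script src="http://tracker.example/t.js"></script>'  # hypothetical

class InjectingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # For a proxied request, self.path is the absolute URL.
        with urlopen(self.path) as upstream:
            body = upstream.read()
            ctype = upstream.headers.get("Content-Type", "")
        if "text/html" in ctype:
            # Splice the tag in, just before </head>.
            body = body.replace(b"</head>", INJECT + b"</head>", 1)
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# Point a browser's HTTP proxy at localhost:8080 and load a plain-http page.
# HTTPS requests arrive as opaque CONNECT tunnels this toy doesn't even
# implement - the same reason the injection above never touched https sites.
HTTPServer(("", 8080), InjectingProxy).serve_forever()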

Conclusion

Access Media 3, my ISP (which we are forced to use in my apartment complex), is using an rXg machine to inject javascript and cookies into any unencrypted HTML packets going through its network.

Implications

As the injected javascript is obfuscated in most circumstances, I have no idea exactly what the effect of the injection is. At the very least I can see multiple references to persisting cookies – a way to track a user's behavior on the internet. As seen on MediaShift's website, it is clear that they offer this data collection system as a way for networks to make money. It's therefore not too much of a stretch to conclude that Access Media is making money by selling data about its users' behavior to unknown parties.

I'm certainly not OK with this at all, and I assume most people wouldn't be. I skimmed through my Access Media contract and they do mention that they have the right to 'monitor' the traffic across their network; however, if by monitor they mean 'conduct XSS injections against every user', I know a lot of people will not be happy, especially given the current state of affairs regarding internet security and tracking.

I'll let Kim DotCom explain why it's important that this tracking does not happen:


Apparently similar behavior by other ISPs has been reported before:

http://arstechnica.com/tech-policy/2013/04/how-a-banner-ad-for-hs-ok/

http://erichelgeson.github.io/blog/2013/12/31/i-fought-my-isps-bad-behavior-and-won/

I’ve sent an email to Access Media so we’ll see what their response is.

 

Ian's Shoelace Site - Lacing Shoes


Comments:" Ian's Shoelace Site - Lacing Shoes"

URL:http://www.fieggen.com/shoelace/lacing.htm



Lacing Shoes
Are all of your shoes, sneakers and boots still laced up the way they were when you bought them? This section presents some of the many fascinating ways of lacing, either for different functions or just for appearances. Why not take the plunge? Whip out those laces and re-do them to suit your needs or personality.
Table of Contents
Lacing Methods
This section presents a fairly extensive selection of 39 shoe lacing tutorials. They include traditional and alternative lacing methods that are either widely used, have a particular feature or benefit, or that I just like the look of.
Bi-Color Lacing Methods
Lacing shoes with two different colors is a great way to display country or team colors or simply to make use of the spare shoelaces that are supplied with many sneakers nowadays.
Lug Lacing Methods
Many shoes, sneakers and boots come with lugs instead of eyelets. This section presents a number of variations of regular Lacing Methods that are suitable for shoes with lugs.
2 Trillion Methods?
On an average shoe with six pairs of eyelets, there are almost 2 Trillion ways to feed a shoelace through those twelve eyelets! Impossible? This page shows the maths behind that extraordinary number; a quick sketch of the arithmetic also follows this list.
Lacing Comparison
I've presented a number of different lacing methods on this site. This page compares both their visual and functional considerations feature by feature to help you choose.
Lacing Ratings
All of the lacing methods on this site have the facility for visitors to give them a rating from 1 to 5 stars. Here, you can view the results of those ratings and compare the popularity of the various methods.
Lacing Photos
Here you can see photos of all sorts of trendy shoes, laced with various lacing methods, that have been sent to me by site visitors. If you're after some lacing inspiration, this is the place!
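Regarding the "2 Trillion Methods?" entry above: here is a back-of-envelope Python sketch of my own (one plausible way of counting, not necessarily Ian's exact derivation). If a lacing is an ordering in which the lace visits all twelve eyelets, with a choice of feeding direction at each eyelet, the count lands just shy of 2 trillion.

from math import factorial

# 12! orderings of the twelve eyelets, times a top/bottom
# feeding direction at each of the 12 eyelets.
eyelets = 12
count = factorial(eyelets) * 2 ** eyelets
print(f"{count:,}")  # 1,961,990,553,600 - almost 2 trillion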
If by "Lacing Shoes" you were actually looking for how to do up the lacing, see my Tying Shoelaces page instead.

This page last updated: 20-Oct-2013. Copyright © 2003-2013 by Ian W. Fieggen. All rights reserved.


SoftEther VPN Becomes Open Source - SoftEther VPN Project


Comments:"SoftEther VPN Becomes Open Source - SoftEther VPN Project"

URL:http://www.softether.org/9-about/News/800-open-source


January 4, 2014

By Daiyuu Nobori, SoftEther VPN Project at University of Tsukuba, Japan.

 

 

We are very happy to announce that the source code of SoftEther VPN has been released as open-source software under the GPLv2 license. SoftEther VPN is the underlying VPN engine of VPN Gate. The source code is provided as packages in .tar.gz and .zip formats, and is also published on our GitHub repository. You can build the full SoftEther VPN programs from the source code on Windows, Linux, Mac OS X, FreeBSD or Solaris computers. You can also generate your own customized installer packages of SoftEther VPN automatically from the source code.

 

SoftEther VPN is a production-class VPN software suite that is popular as a tool for building on-premises or cloud-based VPNs. The binaries of SoftEther VPN were released on March 8, 2013. Since then, SoftEther VPN Server has been installed on over 80,000 server computers in Japan, the United States, China, Taiwan, Iran, Germany, the United Kingdom, France, Korea, India and 164 other regions (*1). SoftEther VPN supports Windows, Mac, Linux, and smartphones including iPhone and Android. SoftEther VPN supports multiple VPN protocols, including SSL-VPN, OpenVPN, IPsec, L2TP, MS-SSTP, L2TPv3 and EtherIP, with a single instance of the VPN server program. Individual and corporate network administrators can replace their legacy OpenVPN or Cisco VPN router products with SoftEther VPN.

 

 

One of the popular applications of SoftEther VPN is VPN Gate (http://www.vpngate.net/). VPN Gate is "A Volunteer-Organized Public VPN Relay System with Blocking Resistance for Bypassing Government Censorship Firewalls" (*2). VPN Gate is a circumvention tool for bypassing governments' censorship firewalls. Many Internet users behind censorship firewalls, including the Chinese Great Firewall, use VPN Gate to browse YouTube, Twitter and Facebook. More than 110,000 unique users per day (estimated by the number of client IP addresses) use VPN Gate (*4). The offense and defense between VPN Gate and the Chinese Great Firewall will be reported at the USENIX NSDI 2014 conference (Seattle, April 2-4, 2014) (*3).

 

The source code of SoftEther VPN is approximately 380,000 lines of text, with a total file size of 11 MB. The source code includes not only the user-mode programs of SoftEther VPN, but also kernel-mode device driver code for the Virtual Network Adapter and the Ethernet Bridging Module.

 

 

Many developers are now able to download the SoftEther VPN source code and study how to design and implement a VPN protocol engine that achieves high-performance, multi-protocol VPN communication with a high ability to penetrate firewalls. They can also study the know-how of implementing kernel-mode device drivers that access the low-level Ethernet packet processing fabric on Windows and other modern operating systems.

 

Furthermore, because the source code is published under the traditional GPLv2 (GNU General Public License version 2), SoftEther VPN may be modified, recompiled, embedded into derived software or hardware, or redistributed under new branding by any developers who have the ability to do so.

 

We believe that easy-to-use software-based VPN tools are necessary to achieve a free Internet world. Here, a free Internet world means one in which no government can censor or tap people's communication, and people can use communication technology without any fear of suppression by governments. However, implementing such an easy-to-use VPN tool has required enormous effort to build the VPN engines. By using the source code of SoftEther VPN, any developer can build his or her own VPN-based application. We hope that the release of the SoftEther VPN source code will help such developers, and will also help to achieve a free Internet world in the future.

 

 

*1

Current geographic locations of 81,424 SoftEther VPN Server users on January 4, 2014.
SoftEther VPN Server is installed on server computers around the world.

 

*2
The offense and defense between VPN Gate and the Chinese Great Firewall is reported in our academic paper "VPN Gate: A Volunteer-Organized Public VPN Relay System with Blocking Resistance for Bypassing Government Censorship Firewalls", which was accepted at the USENIX NSDI 2014 conference (Seattle, April 2-4, 2014).
More details: https://www.usenix.org/conference/nsdi14/technical-sessions/presentation/nobori.

 

*3

The graph of number of daily unique source IP addresses of VPN Gate clients.

 

*4
The ranking table of VPN Gate client source locations.
More details in real time: http://www.vpngate.net/en/region.aspx.

 

Build the programs from the source code

To build from the source, see "BUILD_UNIX.TXT" or "BUILD_WINDOWS.TXT" files in the package, or see the following links.

Explore the latest source code tree from GitHub

We use GitHub as the primary official SoftEther VPN repository.

If you are an open-source developer, visit our GitHub repository:

https://github.com/SoftEtherVPN/SoftEtherVPN/

 

You can download the up-to-date source-code tree of SoftEther VPN from GitHub. You may also create your own fork of our project.

 

The download and build instructions are as follows:

git clone https://github.com/SoftEtherVPN/SoftEtherVPN.git
cd SoftEtherVPN
make
make install

 

To circumvent your government's firewall restriction

Because SoftEther VPN is a very strong tool for building VPN tunnels, some censoring governments want to block your access to the source code of SoftEther VPN by abusing their censorship firewalls.

To circumvent your censor's unjust restriction, the SoftEther VPN Project distributes the up-to-date source code on all of the following open-source repositories:

 

To fetch the source code from GitHub:

git clone https://github.com/SoftEtherVPN/SoftEtherVPN.git

 

To fetch the source code from SourceForge:

git clone http://git.code.sf.net/p/softethervpn/code

  - or -

git clone git://git.code.sf.net/p/softethervpn/code

 

To fetch the source code from Google Code:

git clone https://code.google.com/p/softether/

 

We hope that you can reach at least one of the above URLs!

 

Introduction of SoftEther VPN

SoftEther VPN ("SoftEther" means "Software Ethernet") is one of the world's most powerful and easiest-to-use multi-protocol VPN software packages. It runs on Windows, Linux, Mac, FreeBSD and Solaris.

SoftEther VPN is open source. You can use SoftEther for any personal or commercial use free of charge.

SoftEther VPN is an optimal alternative to OpenVPN and Microsoft's VPN servers. SoftEther VPN has a clone function of OpenVPN Server, so you can migrate from OpenVPN to SoftEther VPN smoothly. SoftEther VPN is faster than OpenVPN. SoftEther VPN also supports Microsoft SSTP VPN for Windows Vista / 7 / 8, so there is no more need to pay an expensive Windows Server license fee for a remote-access VPN function.

SoftEther VPN can be used to realize BYOD (Bring Your Own Device) in your business. If you have smartphones, tablets or laptop PCs, SoftEther VPN's L2TP/IPsec server function helps you establish a remote-access VPN from a remote location into your local network. SoftEther VPN's L2TP VPN server has strong compatibility with Windows, Mac, iOS and Android.

SoftEther VPN is not only an alternative VPN server to existing VPN products (OpenVPN, IPsec and MS-SSTP). SoftEther VPN also has an original, strong SSL-VPN protocol to penetrate all kinds of firewalls. SoftEther VPN's ultra-optimized SSL-VPN protocol has very fast throughput, low latency and firewall resistance.

SoftEther VPN has stronger resistance against firewalls than ever. Its built-in NAT traversal penetrates your network admin's troublesome, overprotective firewall. You can set up your own VPN server behind the firewall or NAT in your company, and you can reach that VPN server in the corporate private network from your home or a mobile location, without any modification of firewall settings. Deep-packet-inspection firewalls cannot detect SoftEther VPN's transport packets as a VPN tunnel, because SoftEther VPN uses Ethernet over HTTPS for camouflage.

It is easy to imagine, design and implement your VPN topology with SoftEther VPN. It virtualizes Ethernet in software: SoftEther VPN Client implements a Virtual Network Adapter, and SoftEther VPN Server implements a Virtual Ethernet Switch. You can easily build both remote-access VPNs and site-to-site VPNs, as expansions of Ethernet-based L2 VPNs. Of course, a traditional IP-routing L3-based VPN can also be built with SoftEther VPN.
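As a toy illustration of that L2 idea, here is a heavily simplified Python sketch of my own (nothing like SoftEther's actual implementation): a learning Ethernet switch that forwards frames between attached virtual ports.

class VirtualSwitch:
    def __init__(self):
        self.ports = {}        # port id -> queue of delivered frames
        self.mac_table = {}    # source MAC -> port id it was seen on

    def attach(self, port):
        self.ports[port] = []

    def send(self, port, src_mac, dst_mac, payload):
        self.mac_table[src_mac] = port          # learn where src lives
        frame = (src_mac, dst_mac, payload)
        if dst_mac in self.mac_table:           # known destination: unicast
            self.ports[self.mac_table[dst_mac]].append(frame)
        else:                                   # unknown: flood other ports
            for p, queue in self.ports.items():
                if p != port:
                    queue.append(frame)

switch = VirtualSwitch()
switch.attach("client"); switch.attach("server")
switch.send("client", "aa:aa", "bb:bb", b"hello")  # flooded (bb:bb unknown)
switch.send("server", "bb:bb", "aa:aa", b"world")  # unicast back to client
print(switch.ports["client"])                      # the reply frame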

SoftEther VPN has strong compatibility with today's most popular VPN products. It interoperates with OpenVPN, L2TP, IPsec, EtherIP, L2TPv3, Cisco VPN routers and MS-SSTP VPN clients. SoftEther VPN is the world's only VPN software that supports SSL-VPN, OpenVPN, L2TP, EtherIP, L2TPv3 and IPsec as a single piece of VPN software.

SoftEther VPN is free software because it was developed as Daiyuu Nobori's master's thesis research at the University of Tsukuba. You can download and use it today. The source code of SoftEther VPN is available under the GPL license.

Git: the NoSQL Database // Speaker Deck


Comments:"Git: the NoSQL Database // Speaker Deck"

URL:https://speakerdeck.com/bkeepers/git-the-nosql-database


We all know that Git is amazing for storing code. It is fast, reliable, flexible, and it keeps our project history nuzzled safely in its object database while we sleep soundly at night.

But what about storing more than code? Why not data? Much flexibility is gained by ditching traditional databases, but at what cost?
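A minimal sketch of the idea (mine, not from the talk): Git's plumbing commands already behave like a content-addressed key-value store, with hash-object writing a value and cat-file reading it back by key. In Python:

import subprocess

def git(*args, **kwargs):
    # Run a git command and return its stdout.
    return subprocess.run(["git", *args], check=True, capture_output=True,
                          text=True, **kwargs).stdout

git("init", "datastore")                       # a bare-bones "database"
# Write a value into the object database; the SHA-1 git prints is the key.
key = git("hash-object", "-w", "--stdin", cwd="datastore",
          input='{"user": "brandon", "stars": 42}').strip()
# Read the value back by key.
print(key, "->", git("cat-file", "-p", key, cwd="datastore"))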

Presented at Bacon (http://devslovebacon.com/speakers/brandon-keepers) and the Aloha Ruby Conference (http://aloharubyconf.com/schedule).

Produce the number 2014 without any numbers in your source code - Programming Puzzles & Code Golf Stack Exchange


Comments:"Produce the number 2014 without any numbers in your source code - Programming Puzzles & Code Golf Stack Exchange"

URL:http://codegolf.stackexchange.com/q/17005


This solution is courtesy of BrowserUK on PerlMonks, though I've shaved off some unnecessary punctuation and whitespace from the solution he posted. It's a bitwise "not" on a four character binary string.

say~"ÍÏÎË"

The characters displayed above represent the binary octets cd:cf:ce:cb, and are how they appear in ISO-8859-1 and ISO-8859-15.

Here's the entire script in hex, plus an example running it:

$ hexcat ~/tmp/ten-stroke.pl
73:61:79:7e:22:cd:cf:ce:cb:22
$ perl -M5.010 ~/tmp/ten-stroke.pl
2014

Update: altered to use say instead of print as per @PeterTaylor's comment. Not only does this shave off two further characters, it adds an attractive line break at the end of the output.

Perl - 16 (or even 14) characters

print'````'^RPQT

Using bitwise "xor" on the two four-character strings "RPQT" and "````".
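Both tricks are easy to verify outside Perl; a quick Python check of my own:

# Perl's ~ complements every byte of the string ...
complemented = bytes(0xFF ^ b for b in b"\xcd\xcf\xce\xcb")  # say~"ÍÏÎË"
print(complemented.decode("latin-1"))                        # -> 2014

# ... and ^ XORs two strings byte by byte.
xored = bytes(a ^ b for a, b in zip(b"````", b"RPQT"))       # '````'^'RPQT'
print(xored.decode("ascii"))                                 # -> 2014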

You can knock off a couple of characters by using die instead of print. However, this results in output to STDERR instead of STDOUT, and the output will be suffixed with the line number of the error. So I class that as a cheat.

(I initially had the two strings the other way around, which required whitespace between print and RPQT to separate the tokens. @DomHastings pointed out that by switching them around I could save a character.)

Indian GSLV successfully lofts GSAT-14 satellite | NASASpaceFlight.com


Comments:"Indian GSLV successfully lofts GSAT-14 satellite | NASASpaceFlight.com"

URL:http://www.nasaspaceflight.com/2014/01/indian-gslv-launch-gsat-14-communications-satellite/


January 4, 2014 by William Graham

India’s Geosynchronous Satellite Launch Vehicle (GSLV) has ended a run of four consecutive launch failures by deploying the GSAT-14 communications satellite on Sunday, following launch at 10:48 UTC. The mission – from the Second Launch Pad at the Satish Dhawan Space Centre – was a realigned attempt, following the scrub and rollback for repairs on the rocket last year.


Indian Launch Preview:

This launch was originally set to take place in August of last year. However, it was delayed by several problems – not least during its August 19 countdown, when its second stage began leaking large amounts of hydrazine fuel over the launch pad.

A lengthy delay followed, not least because the entire vehicle had been contaminated by the leak. As a result, the vehicle was rolled back and dismantled. It now sports two new stages and refurbished boosters, while the second stage is now utilizing aluminium alloy tankage.

The Geosynchronous Satellite Launch Vehicle, first flown in 2001, is the newest rocket in India’s fleet, designed to place communications satellites into geosynchronous transfer orbits. It is the fourth rocket to be developed by India, following the Satellite Launch Vehicle, Augmented Satellite Launch Vehicle and Polar Satellite Launch Vehicle.

India’s first orbital launch attempt took place on 10 August 1979, with a Satellite Launch Vehicle carrying the Rohini Technology Payload, or RTP. This launch failed to orbit after the rocket’s second stage thrust vector control system malfunctioned. The next launch on 18 July 1980 saw the SLV successfully orbit the Rohini RS-1 satellite.

Two more SLVs were launched; in May 1981 and April 1983, with Rohini RS-D1 and RS-D2 respectively. The 1981 launch was unsuccessful, with RS-D1 being placed into an unusable, rapidly decaying, low orbit, from which it reentered within nine days of launch. The SLV, which is also known as the SLV-3, retired from service with a record of two successes and two failures.

The SLV was replaced by the Augmented Satellite Launch Vehicle, which consisted of a similar core vehicle to the SLV-3, but with an additional first stage consisting of two more S-9 rocket motors. The S-9 was used as the first stage of the SLV, which became the ASLV’s second stage, and is still used as a booster rocket on some PSLV launches.

The first ASLV launched in March 1987, carrying the SROSS-A, or Stretched Rohini A, satellite. The second stage failed to ignite, and as a result the rocket was unable to achieve orbit. The next launch, which occurred in July 1988, fared no better, with the rocket’s attitude control system failing late in first stage flight.

The third ASLV reached low Earth orbit; however, incorrect spin stabilization of the rocket's fifth stage resulted in the orbit being lower than had been planned, and the SROSS-C satellite could only return limited data for less than two months of a planned six-month mission.

The fourth and final ASLV launch carried a replacement for SROSS-C; SROSS-C2. On this mission the ASLV performed successfully, deploying the satellite into its target orbit. SROSS-C2 was able to operate for four years – more than eight times its design life. Following the fourth launch, which took place on 4 May 1994, the rocket was retired in favor of the PSLV, which had made its first test flight the previous year.

The PSLV, or Polar Satellite Launch Vehicle, remains the workhorse of India’s space program. It has achieved 23 successful launches from 25 attempts since its maiden flight on 20 September 1993.

The maiden flight, which carried the IRS-1E satellite, remains the rocket’s only outright failure; the rocket’s attitude control system failed at second stage separation, with the vehicle unable to make orbit.

Following two successful launches, carrying IRS-P2 and IRS-P3 in 1994 and 1996 respectively, the PSLV was declared operational. The payload for the first operational mission was IRS-1D, which was destined for a sun-synchronous orbit.

PSLV C1, as the rocket was designated, lifted off from Sriharikota on 29 September 1997; however, a fourth stage helium leak left the rocket unable to reach its target.

Instead, IRS-1D was placed into a lower-than-planned orbit. The satellite was able to reach a usable orbit, still somewhat lower than had initially been planned, at the expense of most of its own propellant supply.

The IRS-1D launch was the most recent failure of a PSLV; in the 20 launches since, it has performed perfectly. Most of the PSLV’s flights have placed remote sensing satellites into sun-synchronous orbit; however, it has also been used for other missions.

Click here for other ISRO News Articles: http://www.nasaspaceflight.com/?s=ISRO

The seventh PSLV launch, in September 2002, carried the METSAT-1 weather satellite bound for geosynchronous orbit. METSAT-1 was later renamed Kalpana-1 after astronaut Kalpana Chawla, who was killed in the Columbia accident.

Two other launches have been made to geosynchronous transfer orbit; a communications satellite, GSAT-12, in 2011, and the IRNSS-1A navigation satellite last month. In October 2008, India used a PSLV to launch its first mission to the Moon, Chandrayaan-1.

The first commercial PSLV launch took place in April 2007, carrying the AGILE gamma-ray astronomy satellite for the Italian space agency. The next launch, in January 2008, orbited Israel’s TecSAR radar reconnaissance satellite. The PSLV has also launched two radar imaging satellites for the Indian military; RISAT-2 which was built with assistance from Israel, and later RISAT-1, which India developed independently.

A launch last year carried the SPOT-6 satellite for the French space agency, CNES, and two Franco-Indian scientific satellites, Megha-Tropiques and SARAL, have also been launched.

The most recent PSLV mission lofted the Mars Orbiter Mission (MOM) in November.

PSLV launches have carried a number of secondary payloads, including the SRE-1 satellite which was recovered after several days in orbit in 2007. In recent years, many CubeSats have found launch opportunities on the PSLV. One launch carried ten payloads – the most an Indian rocket has launched to date, although not the most of any launch by any country.

With the PSLV operational, India looked to develop a rocket capable of launching its communications satellites to geosynchronous orbit. While the PSLV has been able to launch geosynchronous satellites, it has only been able to place very small satellites into fairly low transfer orbits, whereas the GSLV can launch larger payloads into more typical, higher, transfer orbits.

The maiden flight of the GSLV was conducted on 20 April 2001, carrying an experimental communications satellite named GramSat-1, or GSAT-1. The first two stages performed well; however, the third stage underperformed, leaving the payload in a lower orbit than had been planned.

Despite attempts to recover the satellite, using its own propulsion system to make up the shortfall, a design fault stemming from the satellite having been partially constructed from spare parts led to it running out of fuel short of geostationary orbit.

After the initial failure, the next flight in May 2003 fared better, placing GSAT-2 into its planned transfer orbit. Following this, the GSLV was declared operational, and its third flight successfully orbited GSAT-3, also known as HealthSat, in September 2004.

The fourth GSLV, launched in July 2006, was expected to place the INSAT-4C communications satellite into orbit. Before the rocket even launched, a thrust regulator in one of the four booster rockets failed, resulting in that booster producing more thrust than it was designed to withstand, which caused the engine to fail less than a second after launch.

The rocket flew on for around a minute before disintegrating as it approached the area of maximum aerodynamic pressure.

A replacement for INSAT-4C, INSAT-4CR, was carried by the fifth GSLV, F04, which flew in September 2007. This launch also failed to reach its target orbit, suffering a shortfall similar to the GSAT-1 mission; however, unlike GSAT-1, INSAT-4CR was able to correct its own orbit.

The GSLV Mk.II, which features a new upper stage with an Indian-built engine, made its first launch in April 2010 with the GSAT-4 satellite as its payload.  While the first and second stages performed well, the new third stage engine failed 2.2 seconds after it ignited, and the rocket did not achieve orbit.

This failure has been attributed to a problem with the Fuel Boost Turbopump (FBTP), which appeared to lose speed a second after ignition. Following the failure, ISRO opted to conduct further tests on the new third stage, with two leftover GSLV Mk.Is flying in the interim.

The first of these rockets was launched on 25 December 2010 with GSAT-5P. Bound for geostationary transfer orbit, the rocket was destroyed by range safety 53 seconds after a loss of control.

An investigation determined that connectors in a Russian-built interstage adaptor had snapped, leaving the strap-on boosters uncontrollable; however, Russian officials blamed a structural failure of the payload fairing.

It was later reported that problems with the connectors had occurred before – including one snapping during the launch of INSAT-4CR which was responsible for the underperformance of that launch.

Owing to the disagreement between India and Russia over the cause of the GSAT-5P failure, the final GSLV Mk.I has not yet flown. It is unclear whether it will ever be launched, or if ISRO will focus instead on the Mk.II. Because of these failures, currently the GSLV is statistically the least reliable rocket in service, with a success rate of 28.6%.

GSLV is a three-stage rocket, with four liquid-fuelled boosters augmenting the first stage. The first stage, or GS-1, is powered by an S-139 solid rocket motor, burning hydroxyl-terminated polybutadiene (HTPB) propellant. The stage can deliver up to 4,800 kilonewtons (1.1 million pounds) of thrust.

The four L40H boosters are powered by Vikas engines burning UH25 – a mixture of three parts unsymmetrical dimethylhydrazine and one part hydrazine hydrate – oxidized by dinitrogen tetroxide. The Vikas engine is derived from the French Viking engine, which was developed for the Ariane family of rockets. Each booster provides 680 kilonewtons (150,000 pounds-force) of thrust.

The second stage, designated the GS-2 or L-37.5H, also uses a Vikas engine; delivering 720 kilonewtons (160,000 lbf) of thrust. The third stage, or GS-3, is a CUS-12 powered by the Indian Cryogenic Engine, or ICE. Burning liquid hydrogen propellant with liquid oxygen as an oxidiser, the ICE will deliver 75 kilonewtons (17,000 lbf) of thrust.

The launch began with the ignition of the four boosters, 4.8 seconds ahead of the planned liftoff time. The solid-fuelled core stage ignited at T-0 and burned for 100 seconds. Once it had completed its burn, the first stage remained attached, as the boosters burned for slightly longer than the core. Around 149 seconds after launch, the booster engines shut down, with the second stage igniting half a second later and stage separation occurring two seconds after cutoff.

The second stage burned for 139.5 seconds. About 75 seconds into the burn, fairing separation occurred, with the shroud that protected GSAT-14 during its ascent through the atmosphere separating from the nose of the rocket. Once the second stage completed its firing, it coasted for three and a half seconds before separating.

The third stage ignited a second after staging, beginning a 12-minute, 1.5-second burn to reach the planned geosynchronous transfer orbit.

Spacecraft separation, which targeted an orbit of 180 by 35,975 kilometers (112 by 22,354 statute miles, 97 by 19,425 nautical miles) with an inclination of 19.3 degrees, occurred 13 seconds after the end of the third stage burn – seventeen minutes and eight seconds after liftoff.

Allowable error margins for the launch were plus or minus 5 kilometers (3.1 mi, 2.7 nmi) in perigee altitude, 675 kilometers (420 mi, 365 nmi) in apogee altitude, and a tenth of a degree in inclination.
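As a sanity check on those numbers, a quick back-of-the-envelope calculation of the implied transfer orbit (standard two-body formulas; the Earth constants are assumed values, not from the article):

import math

MU_EARTH = 398600.4418  # km^3/s^2, Earth's standard gravitational parameter (assumed)
R_EARTH = 6378.0        # km, equatorial radius (assumed)

perigee_alt = 180.0     # km, from the article
apogee_alt = 35975.0    # km, from the article

# the semi-major axis is half the sum of the perigee and apogee radii
a = (2 * R_EARTH + perigee_alt + apogee_alt) / 2.0
period = 2 * math.pi * math.sqrt(a ** 3 / MU_EARTH)

print("semi-major axis: %.0f km" % a)            # ~24456 km
print("period: %.1f hours" % (period / 3600.0))  # ~10.6 hours, typical for a GTO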

Compared to previous launches, GSLV D5 incorporates several modifications intended to increase its reliability.

The interstage between the second and third stages was redesigned to allow it to handle greater loads, while the tunnel containing electrical connections between the stages has also been made more durable. The FBTP has been modified to allow it to perform better at the low temperatures it is expected to operate under.

The flight’s aerodynamic profile and third stage ignition sequence were also adjusted. In addition, the rocket carried cameras for the first time, to record its operation.

GSAT-14 is a 1,982 kilogram (4,370 lb) satellite, which was constructed by ISRO and is based on the I-2K bus. It is equipped with six C and six Ku-band transponders, powered by twin solar arrays which generate up to 2,600 watts of power and charge lithium ion batteries. In addition to its communications payload, the satellite carries two Ka-band payloads which will be used for an investigation of how weather affects satellite communications.

GSAT-14 will be positioned at a longitude of 74 degrees east, and is expected to operate for at least 12 years. Most of its mass is fuel, much of which will be expended on maneuvers to raise itself from the initial transfer orbit into geostationary orbit. It has a dry mass of 851 kilograms (1,876 lb), leaving roughly 1,131 kilograms (2,494 lb) – about 57 percent of the launch mass – as propellant.

The Satish Dhawan Space Center, located in Sriharikota, India, has been the site of all of India’s orbital launches. Originally known as the Sriharikota High Altitude Range, or Sriharikota Range, it was named after ISRO’s second chairman, Satish Dhawan, following his death in 2002. The launch took place from the Second Launch Pad at the center.

The somewhat confusingly named Second Launch Pad (SLP) at the Satish Dhawan Space Centre is actually the fifth launch complex to be built at the site – following a sounding rocket complex to the north, disused SLV and ASLV complexes to the south, and the nearby First Launch Pad.

The GSLV can launch from either the First Launch Pad, which was built in the 1990s for the PSLV, or from the Second Launch Pad. Since the completion of the Second pad, all GSLV launches have used it. D5 is the fifth GSLV and twelfth rocket overall to fly from the Second Launch Pad.

The Second Launch Pad was constructed in the early 2000s, and was first used for a PSLV launch in May 2005, with the CartoSat-1 satellite. Like the First Launch Pad, it can host both PSLV and GSLV launches.

Rockets are assembled vertically in an integration building some distance from the pad, and then moved to the launch pad atop a mobile platform running on rails.

(Images via ISRO).


More About Unicode in Python 2 and 3 | Armin Ronacher's Thoughts and Writings

$
0
0

Comments:"More About Unicode in Python 2 and 3 | Armin Ronacher's Thoughts and Writings"

URL:http://lucumr.pocoo.org/2014/1/5/unicode-in-2-and-3/


written on Sunday, January 5, 2014

It's becoming increasingly hard to have reasonable discussions about the differences between Python 2 and 3, because one language is dead and the other is actively developed. So when someone starts a discussion about the Unicode support in those two languages, it's not an even playing field. I therefore won't discuss the actual Unicode support of the two languages, but the core model of how each deals with text and bytes.

I will use this post to show, purely from the design of the language and standard library, why Python 2 is the better language for dealing with text and bytes.

Since I have to maintain lots of code that deals exactly with the path between Unicode and bytes, this regression from 2 to 3 has caused me lots of grief. Especially seeing slides by core Python maintainers about how I should trust them that 3.3 is better than 2.7 makes me more than angry.

The Text Model

The main difference between Python 2 and Python 3 is the basic types that exist to deal with text and bytes. On Python 3 we have one text type, str, which holds Unicode data, and two byte types, bytes and bytearray.

On the other hand, on Python 2 we have two text types: str, which for all intents and purposes is limited to ASCII plus some undefined data above the 7-bit range, and unicode, which is equivalent to the Python 3 str type; and one byte type, bytearray, which it inherited from Python 3.

Looking at that, you can see that Python 3 removed something: support for non-Unicode text data. For that sacrifice it gained a hashable byte type, the bytes object. bytearray is a mutable type, so it's not suitable for hashing. I very rarely use true binary data as dictionary keys, though, so it does not show up as a big problem – especially because in Python 2 you can just put bytes into the str type without issues.

The Lost Type

Python 3 essentially removed the byte-string type, which in 2.x was called str. On paper there is nothing inherently wrong with it. From a purely theoretical point of view, text always being Unicode sounds awesome. And it is – if your whole world is just your interpreter. Unfortunately, that's not how it works in the real world, where you need to interface with bytes and different encodings on a regular basis, and for that the Python 3 model completely breaks down.

Let me be clear upfront: Python 2's way of dealing with Unicode is error prone and I am all in favour of improving it. My point though is that the one in Python 3 is a step backwards and brought so many more issues that I absolutely hate working with it.

Unicode Errors

Before I go into the details, we need to understand what the differences between the Unicode support in Python 2 and 3 are, and why the decision was made to change it.

Python 2, like many languages before it, was created without support for dealing with strings of different encodings. A string was a string, and it contained bytes. It was up to the developer to properly deal with different encodings manually. This actually works remarkably well in many situations. The Django framework for many years did not support Unicode at all, and used Python's byte-string interface exclusively.

Python 2 however also gained better and better support for Unicode internally over the years and through this Unicode support it gained support for different encodings to represent that Unicode data.

In Python 2 the way of dealing with strings of a specific encoding was actually remarkably simple when it started out. You took a string you got from somewhere (a byte-string) and decoded it, using the encoding you got from a side-channel (header data, metadata, specification), into a Unicode string. Once it was a Unicode string, it supported the same operations as a regular byte-string, but it supported a much larger character range. When you needed to send that string elsewhere for processing, you usually encoded it back into an encoding that the other system could deal with, and it became a byte-string again.
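A minimal sketch of that round trip (Python 2; the specific encodings here are my examples):

raw = 'caf\xc3\xa9'            # a UTF-8 encoded byte-string from the wire
text = raw.decode('utf-8')     # now a Unicode string: u'caf\xe9'
text = text.upper()            # process it in Unicode land
out = text.encode('latin-1')   # re-encode for a system that expects latin-1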

So what were the issues with that? At the core this worked; unfortunately, Python 2 needed to provide a nice migration path from the non-Unicode world into the Unicode world. This was done by allowing coercion between byte-strings and Unicode strings. When does this happen and how does it work?

Essentially, when you have an operation involving a byte-string and a Unicode string, the byte-string is promoted to a Unicode string through an implicit decoding process that uses the “default encoding”, which is set to ASCII. Python did provide a way to change this encoding at one point, but nowadays the site.py module removes the sys.setdefaultencoding function after setting the encoding to ASCII. If you start Python with the -S flag, the function is still there, and you can experiment with what happens if you set your Python default encoding to UTF-8, for instance.

So here are some situations where the default encoding kicks in:

  • Implicit decoding upon string concatenation:

    >>> "Hello " + u"World"
    u'Hello World'

    Here the string on the left is decoded, using the default system encoding, into a Unicode string. If it contained non-ASCII characters, this would normally blow up with a UnicodeDecodeError, because the default encoding is set to ASCII.

  • Implicit decoding through comparison: this sounds more evil than it is. Essentially the left side is decoded to Unicode and then compared. In case the left side cannot be decoded, it will warn and return False. This is actually surprisingly sane behavior, even though it sounds insane at first.

  • Implicit decoding as part of a codec. This one is evil, and most likely the source of all confusion about Unicode in Python 2. Confusing enough that Python 3 took the absolutely insanely radical step of removing .decode() from Unicode strings and .encode() from byte strings, which caused me major frustration. In my mind this was an insanely stupid decision, but I have been told more than once that my point of view is wrong and it won't be changed back. The implicit decoding as part of a codec operation looks like this:

    >>> "foo".encode('utf-8')
    'foo'

    Here the string is obviously a byte-string. We ask it to encode to UTF-8. This by itself makes no sense, because the UTF-8 codec encodes from Unicode to UTF-8 bytes. So how does this work? It works because the UTF-8 codec sees that the object is not a Unicode string and first performs a coercion to Unicode through the default codec. Since "foo" is ASCII-only and the default encoding is ASCII, this coercion succeeds, and the resulting u"foo" string is then encoded through UTF-8.

Codec System

So you now know that Python 2 has two ways to represent strings: as bytes and as Unicode. The conversion between the two happens through the Python codec system. However, the codec system does not enforce that a conversion always takes place between Unicode and bytes or the other way around. A codec can implement a transformation from bytes to bytes, or from Unicode to Unicode. In fact, the codec system can implement a conversion between any Python types. You could have a JSON codec that decodes from a string into a complex Python object if you so desire, as sketched below.
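For illustration, such a hypothetical 'json' codec could be wired up through the public codecs registry (a sketch of the idea, not anything from the standard library):

import codecs
import json

def search(name):
    # codec search functions return a CodecInfo for names they handle, else None
    if name != 'json':
        return None
    def enc(obj, errors='strict'):
        s = json.dumps(obj)
        return s, len(s)
    def dec(s, errors='strict'):
        return json.loads(s), len(s)
    return codecs.CodecInfo(enc, dec, name='json')

codecs.register(search)

# the codec machinery happily hands back a dict here -- no bytes/Unicode involved
print codecs.lookup('json').decode('{"a": 1}')[0]   # {u'a': 1}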

That this might cause issues at one point has been understood from the very start. There is a codec called 'undefined' which can be set as default encoding in which case any string coercion is disabled:

>>> import sys
>>> sys.setdefaultencoding('undefined')
>>> "foo" + u"bar"
Traceback (most recent call last):
  raise UnicodeError("undefined encoding")
UnicodeError: undefined encoding

This is implemented as a codec that raises errors for any operation. The sole purpose of that module is to disable the implicit coercion.

So how did Python 3 fix this? Python 3 removed all codecs that don't go from bytes to Unicode or vice versa, and removed the now-useless .encode() method on bytes and .decode() method on strings. Unfortunately, that turned out to be a terrible decision, because there are many, many codecs that are incredibly useful. For instance, it's very common to encode with the hex codec in Python 2:

>>> "\x00\x01".encode('hex')'0001'

While you might argue that this particular case can also be handled by a module like binascii, there is a deeper problem: the codec module is also separately available. For instance, libraries implementing reading from sockets used the codec system to perform partial decoding of zlib streams:

>>> import codecs
>>> decoder = codecs.getincrementaldecoder('zlib')('strict')
>>> decoder.decode('x\x9c\xf3H\xcd\xc9\xc9Wp')
'Hello '
>>> decoder.decode('\xcdK\xceO\xc9\xccK/\x06\x00+\xad\x05\xaf')
'Encodings'

This was eventually recognized, and Python 3.3 restored those codecs. Now, however, we're in the land of user confusion again, because the codecs don't declare, ahead of the call, which types they can deal with. Because of this you can now trigger errors like this on Python 3:

>>> "Hello World".encode('zlib_codec')Traceback (most recent call last):
 File "<stdin>", line 1, in <module>TypeError: 'str' does not support the buffer interface

(Note that the codec is now called zlib_codec instead of zlib because Python 3.3 does not have the old aliases set up.)

So, given the current state of Python 3.3, what exactly would happen if we got the .encode() method on byte strings back? This is easy to test, even without having to hack the Python interpreter. Let's just settle for a function with the same behavior for the moment:

import codecs

def encode(s, name, *args, **kwargs):
    codec = codecs.lookup(name)
    rv, length = codec.encode(s, *args, **kwargs)
    if not isinstance(rv, (str, bytes, bytearray)):
        raise TypeError('Not a string or byte codec')
    return rv

Now we can use this as a replacement for the .encode() method we had on byte strings:

>>> b'Hello World'.encode('latin1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'bytes' object has no attribute 'encode'
>>> encode(b'Hello World', 'latin1')
Traceback (most recent call last):
  File "<stdin>", line 4, in encode
TypeError: Can't convert 'bytes' object to str implicitly

Oha! Python 3 can already deal with this. And we get a nice error. I would even argue that “Can't convert 'bytes' object to str implicitly” is a lot nicer than “'bytes' object has no attribute 'encode'”.

Why do we still not have those encoding methods back? I really don't know, and I no longer care either. I have been told multiple times now that my point of view is wrong and that I don't understand beginners, or that the “text model” has been changed and my request makes no sense.

Byte-Strings are Gone

Aside from the codec system regression, there is also the fact that all text operations are now only defined for Unicode strings. In a way this seems to make sense, but it does not really. Previously the interpreter had implementations for operations on both byte strings and Unicode strings. This was pretty obvious to the programmer, as custom objects had to implement both __str__ and __unicode__ if they wanted to be formatted into either. Again, there was implicit coercion going on which confused newcomers, but at least we had the option of both.

Why was this useful? Because for instance if you write low-level protocols you often need to deal with formatting numbers out into byte strings.

Python's own version control system has still not moved to Python 3, for years now, because the Python team does not want to bring back string formatting for bytes.

This is getting ridiculous now, though, because it has turned out that the model chosen for Python 3 just does not work in reality. For instance, in Python 3 the developers just “upgraded” some APIs to be Unicode-only, making them completely useless for real-world situations. You could no longer parse byte-only URLs with the standard library – the implicit assumption was that every URL is Unicode (for that matter, you could not handle non-Unicode mails any more either, completely ignoring that binary attachments exist).

This was fixed, obviously, but because byte strings are gone, the URL parsing library now ships two implementations – one for Unicode strings and one for byte objects. They sit behind the same function, but the return values are vastly different:

>>> from urllib.parse import urlparse
>>> urlparse('http://www.google.com/')
ParseResult(scheme='http', netloc='www.google.com', path='/', params='', query='', fragment='')
>>> urlparse(b'http://www.google.com/')
ParseResultBytes(scheme=b'http', netloc=b'www.google.com', path=b'/', params=b'', query=b'', fragment=b'')

Looks similar? Not at all, because they are made of different types. One is a tuple of strings, the other is more like an array of integers. I have written about this before, and it still pains me. It makes writing code for Python 3 incredibly frustrating now, or hugely inefficient, because you need to go through multiple encode and decode steps. Aside from that, it's really hard to write fully functional code now. The idea that everything can be Unicode is nice in theory, but totally not applicable to the real world.

Python 3 is riddled with weird workarounds now for situations where you cannot use Unicode strings and for someone like me, who has to deal with those situations a lot, it's ridiculously annoying.

Our Workarounds Break

The Unicode support in 2.x was not perfect, far from it. There were missing APIs and problems left and right, but we as programmers made it work. Unfortunately, many of the ways in which we made it work do not transfer well to Python 3, and some of the APIs would have had to be changed to work well on Python 3.

My favourite example now is file streams, which as before are either text or bytes, but there is no way to reliably figure out which one is which. The trick which I helped popularize is to read zero bytes from the stream to figure out which type it is. Unfortunately, those workarounds don't work reliably either. For instance, passing a urllib request object to Flask's JSON parse function breaks on Python 3 but works on Python 2, as a result of this:

>>> from urllib.request import urlopen
>>> r = urlopen('https://pypi.python.org/pypi/Flask/json')
>>> from flask import json
>>> json.load(r)
Traceback (most recent call last):
  File "decoder.py", line 368, in raw_decode
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: No JSON object could be decoded

The Outlook

There are many more problems with Python 3's Unicode support than just those. I started unfollowing Python developers on Twitter because I got so fed up with having to read about how amazing Python 3 is which is in such conflict with my own experiences. Yes, lots of things are cool in Python 3, but the core flow of dealing with Unicode and bytes is not.

(The worst of all of this is that many of the features in Python 3 which are genuinely cool could just as well work on Python 2 as well. Things like yield from, nonlocal, SNI SSL support etc.)

In light of only about 3% of all Python developers using Python 3 properly, and developers proudly declaring on Twitter that “the migration is going as planned”, I got so incredibly frustrated that I nearly published a multi-page rant about my experience with Python 3 and how we should kill it.

I won't do that now, but I do wish Python 3 core developers would become a bit more humble. For 97% of us, Python 2 is our beloved world for years to come, and constantly being told how amazing Python 3 is is not just painful, it's also wrong in light of the many regressions. With people starting to discuss Python 2.8, a Stackless Python 2.8 coming up, and these bad usage numbers, I don't know what failure is, if not that.

This entry was tagged python and thoughts

Why the world needs OpenStreetMap - emacsen.net

$
0
0

Comments:"Why the world needs OpenStreetMap - emacsen.net"

URL:http://blog.emacsen.net/blog/2014/01/04/why-the-world-needs-openstreetmap/


Every time I tell someone about OpenStreetMap, they inevitably ask “Why not use Google Maps?”. From a practical standpoint, it’s a reasonable question, but ultimately this is not just a matter of practicality, but of what kind of society we want to live in. I discussed this topic in a 2008 talk on OpenStreetMap I gave at the first MappingDC meeting. Here are many of same concepts, but expanded.

In the 1800s, people were struggling with time – not how much of it they had, but what time it was. Clocks existed, but every town had its own “local time”, synchronized by town clocks or, more often than not, church bells. Railway time, and eventually Greenwich Mean Time, supplanted all local time, and most people today don’t think of time as anything but universal. This was accomplished in the US by adoption first by the railroads, and then by universities and large businesses.

The modern-day time dilemma is geography, and everyone is looking to be the definitive source. Google spends $1 billion annually maintaining their maps, and that does not include the $1.5 billion Google spent buying Waze. Google is far from the only company trying to own everywhere: Nokia purchased Navteq, and TomTom and Tele Atlas have merged. All of these companies want to become the definitive source of what’s on the ground.

That’s because what’s on the ground has become big business. With GPSes in every car, and a smartphone in every pocket, the market for telling you where you are and where to go has become fierce.

With all these companies, why do we need a project like OpenStreetMap? The answer is simply that, as a society, no one company should have a monopoly on place, just as no one company had a monopoly on time in the 1800s. Place is a shared resource, and when you give all that power to a single entity, you are giving them the power not only to tell you about your location, but to shape it. In summary, there are three concerns: who decides what gets shown on the map, who decides where you are and where you should go, and personal privacy.

Who decides what gets displayed on a Google Map? The answer is, of course, that Google does. I heard this concern in a meeting with a local government in 2009 – they were concerned about using Google Maps on their website because Google makes choices about which businesses to display. They were right to be concerned, since a government needs to remain impartial; by outsourcing their maps, they hand the control over to a third party.

It seems inevitable that Google will monetize geographic searches, with either premium results or priority ordering, if they haven’t done so already (i.e. is it a coincidence that when I search for “breakfast” near my home, the first result is “SUBWAY® Restaurants”?).

Of course Google is not the only map provider, they’re just one example. The point is that when you use any map provider, you are handing them the controls- letting them determine what features get emphasized, or what features may not be displayed at all.

The second concern is about location. Who defines where a neighborhood is, or whether or not you should go there? This issue was brought up by the ACLU, when a map provider offering routing (driving/biking/walking directions) used what it determined to be safe or dangerous neighborhoods as part of its algorithm. This raises the question of who determines what makes a neighborhood safe, or whether safe is merely a codeword for something more sinister.

Right now, Flickr collects neighborhood information based on photographs, which it exposes through an API. It uses this information to suggest tags for your photographs, but it would be possible to use neighborhood boundaries in a more subtle way in order to affect anything from traffic patterns to real estate prices, because when a map provider becomes large enough, it becomes the source of “truth”.

Lastly, these map providers have an incentive to collect information about you in ways that you may not agree with. Both Google and Apple collect your location information when you use their services. They can use this information to improve their map accuracy, but Google has already announced that it is going to use this information to track the correlation between searches and where you go. With 500 million Android phones, this is an enormous amount of information collected, at the individual level, about people’s habits – whether they’re taking a casual stroll, commuting to work, going to their doctor, or maybe attending a protest. Certainly we can’t ignore the societal implications of so much data in the hands of a single entity, no matter how benevolent it claims to be. Companies like Foursquare use gamification to overlay what is essentially a large-scale data collection process, and even Google has gotten into the game of gamification with Ingress, a game which overlays an artificial world onto this one and encourages users to collect routing data and photo mapping as part of an effort to either fight off, or encourage, an alien invasion.

Now that we have identified the problems, we can examine how OpenStreetMap solves each of them.

In terms of map content, OpenStreetMap is both neutral and transparent. OpenStreetMap is a wiki-like map that anyone in the world can edit. If a store is missing from the map, it can be added in, by a store owner or even a customer. In terms of display (rendering), each person or company who creates a map is free to render it how they like, but the main map on OpenStreetMap.org uses FLOSS (Free/Libre Open Source Software) rendering software and a liberally licensed stylesheet which anyone can build on. In other words, someone who cares can always create their own maps based on the same data.

Similarly, while the most popular routers for OpenStreetMap are FLOSS, even if a company chooses another software stack, a user is always free to use their own routing software, and it would be easy to compare routing results based on the same data to find anomalies.

And lastly, with OpenStreetMap data, a user is free to download some, or all of the map offline. This means that it’s possible to use OpenStreetMap data to navigate without giving your location away to anyone at all.

OpenStreetMap respects communities and respects people. If you’re not already contributing to OSM, consider helping out. If you’re already a contributor- Thank You.

Update Wow, this post has hit #1 on Hacker News and #2 on /r/technology on Reddit! Thanks all! Redditors, do you know about /r/openstreetmap?


ConvNetJS: Deep Learning in your browser

$
0
0

Comments:"ConvNetJS: Deep Learning in your browser"

URL:http://cs.stanford.edu/people/karpathy/convnetjs/


ConvNetJS is a Javascript library for training Deep Learning models (mainly Neural Networks) entirely in your browser. Open a tab and you're training. No software requirements, no compilers, no installations, no GPUs, no sweat.

Short Intro

Several large companies (Google, Facebook, Microsoft, Baidu) now use Deep Learning models for various Machine Learning tasks, most notably and successfully speech and image recognition, and slowly natural language processing. Read more.

Deep Learning is about stacking different types of representation transformers (layers) on top of each other, like when you make a sandwich. Unlike a sandwich, however, each layer accepts a volume of numbers (we like to call them activations, since we think of each number as the firing rate of a neuron) and transforms it into a different volume of numbers using some set of internal parameters (we like to think of those as trainable synapses).

In the simplest and most traditional setup, the first volume of activations represents your input and the last volume represents probabilities of the input volume being among any one of several distinct classes. During training you provide the network many examples of pairs of (input volume, class) and the network tunes its parameters ("learns") to transform inputs into correct class probabilities.

Here's an MNIST digits example: suppose we have an image (a 28x28 array of pixel values) that contains a 4. We create a 2D volume of size (28,28) and fill it with the pixel values. Then we pipe the input volume through the network and get an output volume of 10 numbers representing the probability that the input is any one of the 10 digits:

So we transformed the original image into probabilities, taking many intermediate forms of representation along the way (those are the gray boxes in my artistic rendering). But wait, it looks like the network only assigned 10 percent probability to this input being a 4, and 60 percent to it being a 9! That's fine: by design, the mapping from input to output is just a mathematical function parameterized by a bunch of numbers, and we can tune these parameters to make the network slightly more likely to give class 4 a higher probability for this particular input in the future.

The details require a bit of calculus, but you basically take pen and paper, write down the expression for the probability of digit 4 for this particular input (this is what we wish to increase) and derive the gradient with respect to all of the network's parameters using the chain rule. Next, write up some code to compute the gradient and nudge all the parameters slightly along it, as sketched below. Some people like to call this procedure backpropagation (or backprop) but it's really just stochastic gradient descent, a vanilla method in the optimization literature.
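In code, the nudge amounts to something like the following generic update (a sketch only, not the library's internals; the arrays are placeholders):

 var learning_rate = 0.01;
 var params = [0.2, -0.5, 0.1];  // placeholder network parameters
 var grad = [0.05, -0.02, 0.0];  // d(probability of true class) / d(param), from backprop
 for (var i = 0; i < params.length; i++) {
   params[i] += learning_rate * grad[i]; // step along the gradient to raise the true-class probability
 }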

The amount we nudge is called the learning rate, and it is perhaps the single most important number in training these networks. If it's too high, the networks explode with NaNs (I call it a NaN seizure :\), and if it's too low the training will take a very long time. Usually you start it higher (say 0.1) and anneal it slowly over time by a few orders of magnitude (down to 0.0001, perhaps).

In any case, what it comes down to is that after the nudge the network will be a tiny bit more likely to predict a 4 on this image. So we just start from some random parameters, repeat this procedure for tens of thousands of different digits and BAM! The network gradually transforms itself into a digit classifier.

Example use

Let's create a simple 2-hidden-layer neural network classifier and pipe a random volume through it:

 // 4 layers: 
 // input layer of size 1x1x2 (all volumes are 3D)
 // two fully connected layers of neurons
 // a softmax classifier predicting probabilities for two classes: 0,1
 var layer_defs = [];
 layer_defs.push({type:'input', out_sx:1, out_sy:1, out_depth:2});
 layer_defs.push({type:'fc', filters:20});
 layer_defs.push({type:'fc', filters:10});
 layer_defs.push({type:'softmax', num_classes:2});
 // create a net out of it
 var net = new convnetjs.Net();
 net.makeLayers(layer_defs);
 // create a little 2D volume of size 1x1x2 and pipe it through
 var x = new convnetjs.Vol(1,1,2);
 // set takes x,y,d and a value and sets it in the volume
 x.set(0,0,0,0.5);
 x.set(0,0,1,-1.3);
 var probability_volume = net.forward(x);
 console.log('probability that x is class 1: ' + probability_volume.get(0,0,1));
 // prints 0.50101

So we see that the network (which is initialized randomly) assigns probability 50.1% to the point [0.5, -1.3] being class 1. Let's now actually provide this as data to the network, saying x should in fact map to 1 with high probability. We will use the built-in SGDTrainer class:

 var trainer = new convnetjs.SGDTrainer(net, 
 {learning_rate:0.01, momentum:0.0, batch_size:1, decay:0.001});
 trainer.train(x, 1);
 var probability_volume2 = net.forward(x);
 console.log('probability that x is class 1: ' + probability_volume2.get(0,0,1));
 // prints 0.50374

The trainer takes a whole bunch of parameters, but for now just notice that once it backpropagated the information that x is in fact class 1, the network adjusts its parameters to make that more likely. Now you just need a dataset and some CPU time to call trainer.train(x,y) in turns :)

Layers

There are several layers currently available. Your first layer should be input, your last layer softmax (classifier). If you're not dealing with images, you probably want to stack Fully Connected layers, possibly with Dropout layers after layers with many activations for regularization. If you're dealing with images, you usually stack layers of convolutional, pooling and normalization layers. It is common to transition to fully connected layers near the end before the classifier as well.

Input Layer

{type:'input', out_sx:24, out_sy:24, out_depth:1}

A dummy layer that essentially declares the size of the input volume; it must be the first layer in the network. Here, we are declaring that the input will be a volume of size 24x24x1. If you don't have images but have D-dimensional points as input, use 1x1xD.

Convolution Layer

{type:'conv', sx:5, filters:8, stride:1}

Creates a layer that performs convolutions with 8 different 5x5 filters, applied densely (stride 1) with no skips. If the size of the input volume is W1xH1xD1, then the size of the output volume is floor(W1/stride) x floor(H1/stride) x D2, where D2 is the number of filters. For example, a 24x24x1 input through this layer yields a 24x24x8 output. Border conditions are handled by padding with zeros.

Pooling Layer

{type:'pool', sx:3, stride:2}

Performs max pooling in 3x3 regions (pooling happens spatially, independently in each depth slice; the max is not taken across depth). In the above example, 9 numbers are compared and only the largest is passed on. If the size of the input volume is W1xH1xD1, then the size of the output volume is floor(W1/stride) x floor(H1/stride) x D1.

Softmax Layer

{type:'softmax', num_classes:10}

Classifier layer. Currently it should be the last layer in the network. In this example we are declaring that we have 10 classes, [0..9].

Dropout Layer

{type:'dropout', drop_prob:0.5}

Implements dropout regularization, and can be used after layers that have very large volumes to prevent overfitting. The higher drop_prob is (up to 1), the more aggressively the layer regularizes.

Local Contrast Normalization Layer

{type:'lrn', k:1, n:3, alpha:0.1, beta:0.75}

Local contrast normalization according to the usual formula (reproduced below), but with the leading 1 replaced by k. It creates local competition among neurons along depth, independently at every particular location in the input volume.
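(The link to the formula did not survive; for reference, my reconstruction of the commonly used local response normalization, per activation a_i at a fixed spatial location, is

 a_i' = a_i / (k + alpha * sum(a_j^2 over the n depth-neighbors j of i))^beta

where k, n, alpha and beta are the parameters above, with k generalizing the usual leading 1.)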

Fully Connected Layer

{type:'fc', filters:20}

Declares a layer that has connections to all neurons in the previous layer (fully connected), with 20 output neurons. Every neuron computes a dot product and thresholds it at 0 (referred to as a Rectified Linear Unit, or ReLU). ReLUs are a more modern alternative to sigmoid units, which were historically more common but have fallen out of favor because ReLUs give similar performance and train much faster.
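Putting the layer types above together, a typical image-style stack might look like this (a sketch only; the sizes and filter counts are arbitrary examples, assuming the library is loaded as in the earlier examples):

 var layer_defs = [];
 layer_defs.push({type:'input', out_sx:24, out_sy:24, out_depth:1});
 layer_defs.push({type:'conv', sx:5, filters:8, stride:1});
 layer_defs.push({type:'pool', sx:2, stride:2});
 layer_defs.push({type:'fc', filters:30});
 layer_defs.push({type:'dropout', drop_prob:0.5});
 layer_defs.push({type:'softmax', num_classes:10});
 var net = new convnetjs.Net();
 net.makeLayers(layer_defs);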

About

This project was initially started by @karpathy for amusement. I am a PhD student at Stanford studying Machine Learning and Computer Vision and I've worked on Deep Learning both as part of my research and as an intern at Google (multiple times). In early versions of this code I chose to go first for simplicity, core concepts and most common use cases, though many bells and whistles can be added over time to add modeling flexibility and improve training times.

In general, people in the field are eager to make neural nets bigger and faster so Javascript may seem like a strange choice, but I do believe that aside from my own amusement there are multiple potentially interesting uses:

  • Browsers (and Javascript) are ubiquitous. This vastly increases the accessibility of these models to people who wish to quickly learn and tinker.
  • The browser is a beautiful, powerful and mature UI platform that can be used to easily visualize components of the network and how they work.
  • There are potential educational/training uses. I'm hoping a machine learning teacher somewhere might be more inclined to give out a deep learning assignment/homework using the library, as it is so trivial to get the code running on their students' computers.
  • Deep learning models that are pretrained for months on GPUs can be loaded through JSON and applied in browsers. Models can be quantized/compressed to improve loading times if necessary.
  • Imagine a vastly distributed Downpour SGD training system for neural nets where every client just has to keep a tab open, and the tab communicates gradients to a central server. Is it possible to get thousands of people to collaborate on training a huge neural net just by visiting a webpage? :)
The downside, of course, is that we pay a price in efficiency, but Chrome's V8 engine has so far completely shattered my expectations of what it is capable of, and there are technologies on the horizon that could give us additional performance boosts in the near future.

The Stanford NLP (Natural Language Processing) Group

$
0
0

Comments:"The Stanford NLP (Natural Language Processing) Group"

URL:http://nlp.stanford.edu/projects/arabic.shtml


 

The Stanford Natural Language Processing Group

home· people· teaching· research· publications· software· events· local

Arabic Natural Language Processing

Overview

Arabic is the largest member of the Semitic language family and is spoken by nearly 500 million people worldwide. It is one of the six official UN languages. Despite its cultural, religious, and political significance, Arabic has received comparatively little attention from modern computational linguistics. We are remedying this oversight by developing tools and techniques that deliver state-of-the-art performance in a variety of language processing tasks. Machine translation is our most active area of research, but we have also worked on statistical parsing and part-of-speech tagging. This page provides links to our freely available software, along with a list of relevant publications.


Software

  • Stanford Arabic Parser - The full distribution includes a model trained on the most recent releases of the first three parts of the Penn Arabic Treebank (ATB). These corpora contain newswire text. Arabic-specific parsing instructions, a FAQ, and a recommended train/dev/test split of the ATB are also available. The parser expects segmented text as input. If you want to parse raw text, then you must pre-process it with the Stanford Arabic Word Segmenter.
  • Stanford Arabic Word Segmenter - Apply ATB clitic segmentation and orthographic normalization to raw Arabic text. The segmenter is based on a conditional random fields (CRF) sequence classifier so decoding is very fast. This segmenter is appropriate for processing large amounts of text (like machine translation corpora).
  • Stanford Arabic Part of Speech Tagger - The full distribution comes with a model trained on the ATB.
  • Tregex/TregexGUI - A regular expression package for parse trees. Useful for browsing and searching the ATB. Supports Unicode (UTF-8) input and display.

People


Papers

Below is a list of our publications that either deal with Arabic directly or use it as an experimental subject.

  • Spence Green, Sida Wang, Daniel Cer, and Christopher D. Manning. 2013. Fast and Adaptive Online Training of Feature-Rich Translation Models. In ACL. [pdf]
  • Spence Green, Marie-Catherine de Marneffe, and Christopher D. Manning. 2013. Parsing Models for Identifying Multiword Expressions. In Computational Linguistics. [pdf]
  • Spence Green and John DeNero. 2012. A Class-Based Agreement Model for Generating Accurately Inflected Translations. In ACL. [pdf]
  • Spence Green and Christopher D. Manning. 2010. Better Arabic Parsing: Baselines, Evaluations, and Analysis. In COLING. [pdf]
  • Spence Green, Michel Galley, and Christopher D. Manning. 2010. Improved Models of Distortion Cost for Statistical Machine Translation. In NAACL-HLT 2010. [pdf]
  • Spence Green, Conal Sathi, and Christopher D. Manning. 2009. NP subject detection in verb-initial Arabic clauses. In Proceedings of the Third Workshop on Computational Approaches to Arabic Script-based Languages (CAASL3). [pdf]
  • Michel Galley, Spence Green, Daniel Cer, Pi-Chuan Chang, Christopher D. Manning (2009). Stanford University's Arabic-to-English Statistical Machine Translation System for the 2009 NIST Evaluation. The 2009 NIST Open Machine Translation Evaluation Meeting. Ottawa, Canada. [pdf]
  • Michel Galley and Christopher D. Manning. 2008. A Simple and Effective Hierarchical Phrase Reordering Model. In ACL. [pdf]

Contact Information

For more information, please contact Spence Green at


Article 31

Want Perfect Pitch? You Might Be Able To Pop A Pill For That : NPR

$
0
0

Comments:"Want Perfect Pitch? You Might Be Able To Pop A Pill For That : NPR"

URL:http://www.npr.org/2014/01/04/259552442/want-perfect-pitch-you-could-pop-a-pill-for-that?sc=ipad&f=1001


Jazz singer Ella Fitzgerald was said to have perfect pitch.

Klaus Frings/AP

In the world of music, there is no more remarkable gift than having perfect pitch. As the story goes, Ella Fitzgerald's band would use her perfect pitch to tune their instruments.

Although it has a genetic component, most believe that perfect pitch, or absolute pitch, is primarily a function of early-life exposure and training in music, says Takao Hensch, professor of molecular and cellular biology at Harvard.

Hensch is studying a drug which might allow adults to learn perfect pitch by recreating this critical period in brain development. Hensch says the drug, valproic acid, allows the brain to absorb new information as easily as it did before age 7.

"It's a mood-stabilizing drug, but we found that it also restores the plasticity of the brain to a juvenile state," Hensch tells NPR's Linda Wertheimer.

Hensch gave the drug to a group of healthy young men who had no musical training as children. They were asked to perform online tasks to train their ears, and at the end of a two-week period they were tested on their ability to discriminate tone, to see if the training had more effect than it normally would at their age.

In other words, he gave people a pill and then taught them to have perfect pitch. The findings are significant: "It's quite remarkable since there are no known reports of adults acquiring absolute pitch," he says.

Interview Highlights

On whether the drug could be used to teach other skills

There are a number of examples of critical-period type development, language being one of the most obvious ones. So the idea here was, could we come up with a way that would reopen plasticity, [and] paired with the appropriate training, allow adult brains to become young again?

On the likelihood of the drug becoming common for learning new languages

I think we are getting closer to this day, because we are able to understand at greater cellular detail how the brain changes throughout development. But I should caution that critical periods have evolved for a reason, and it is a process that one probably would not want to tamper with carelessly.

If we've shaped our identities through development, through a critical period, and have matched our brain to the environment in which we were raised, acquiring language, culture, identity, then if we were to erase that by reopening the critical period, we run quite a risk as well.

snopes.com: Fukushima Emergency
