Learning a Manifold of Fonts

Machine Learning research from 2014 by Dr Neill Campbell provides an interactive exploration of font forms:

The design and manipulation of typefaces and fonts is an area requiring substantial expertise; it can take many years of study to become a proficient typographer. At the same time, the use of typefaces is ubiquitous; there are many users who, while not experts, would like to be more involved in tweaking or changing existing fonts without suffering the learning curve of professional typography packages.

Given the wealth of fonts that are available today, we would like to exploit the expertise used to produce these fonts, and to enable everyday users to create, explore, and edit fonts. To this end, we build a generative manifold of standard fonts. Every location on the manifold corresponds to a unique and novel typeface, and is obtained by learning a non-linear mapping that intelligently interpolates and extrapolates existing fonts. Using the manifold, we can smoothly interpolate and move between existing fonts. We can also use the manifold as a constraint that makes a variety of new applications possible. For instance, when editing a single character, we can update all the other glyphs in a font simultaneously to keep them compatible with our changes.
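The paper learns a non-linear manifold (so interpolation happens in a learned latent space), but the underlying intuition can be shown with a much cruder sketch: if two fonts' glyphs are sampled to the same number of outline points, blending those points moves you "between" the fonts. Everything below (the point lists, the function name) is hypothetical illustration, not the paper's actual method.

```python
# Crude illustration only: the paper uses a learned non-linear mapping,
# not direct linear blending of outlines.
def interpolate_glyph(points_a, points_b, t):
    """Blend two glyph outlines; t=0 gives font A, t=1 gives font B."""
    return [(ax + t * (bx - ax), ay + t * (by - ay))
            for (ax, ay), (bx, by) in zip(points_a, points_b)]

# Two toy "outlines" for the same character in two hypothetical fonts.
serif = [(0.0, 0.0), (1.0, 0.0), (0.5, 2.0)]
sans  = [(0.2, 0.0), (0.8, 0.0), (0.5, 1.8)]
halfway = interpolate_glyph(serif, sans, 0.5)  # a font "between" the two
```

The manifold's real advantage over this sketch is that it interpolates plausibly: every intermediate point is constrained to look like a font, not just an average of coordinates.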

Try it out for yourself here


Google has released an English parser called Parsey McParseface. Despite the name, the parser is entirely serious - here’s part of their description of it:


One of the main problems that makes parsing so challenging is that human languages show remarkable levels of ambiguity. It is not uncommon for moderate length sentences - say 20 or 30 words in length - to have hundreds, thousands, or even tens of thousands of possible syntactic structures. A natural language parser must somehow search through all of these alternatives, and find the most plausible structure given the context. As a very simple example, the sentence Alice drove down the street in her car has at least two possible dependency parses:

The first corresponds to the (correct) interpretation where Alice is driving in her car; the second corresponds to the (absurd, but possible) interpretation where the street is located in her car. The ambiguity arises because the preposition in can either modify drove or street; this example is an instance of what is called prepositional phrase attachment ambiguity.

Humans do a remarkable job of dealing with ambiguity, almost to the point where the problem is unnoticeable; the challenge is for computers to do the same. Multiple ambiguities such as these in longer sentences conspire to give a combinatorial explosion in the number of possible structures for a sentence. Usually the vast majority of these structures are wildly implausible, but are nevertheless possible and must be somehow discarded by a parser. 
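The combinatorial explosion described above is easy to make concrete: even for a toy grammar where a parse is just a binary bracketing of the words, the number of possible structures over a sentence grows as the Catalan numbers. A quick sketch:

```python
from math import comb

def catalan(n):
    """Number of distinct binary bracketings (parse trees) of n+1 words."""
    return comb(2 * n, n) // (n + 1)

# A 20-word sentence admits catalan(19) binary bracketings.
print(catalan(19))  # -> 1767263190, roughly 1.8 billion candidate structures
```

Real dependency grammars don't enumerate exactly these trees, but the growth rate is the point: a parser must rank an astronomically large space of candidates rather than list them.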

Just the other week, Baltimore Ravens offensive lineman John Urschel co-published a paper in the Journal of Computational Mathematics. The paper “A Cascadic Multigrid Algorithm for Computing the Fiedler Vector of Graph Laplacians” can be found on arXiv.

In an article for The Players’ Tribune, Urschel says, “I am a mathematical researcher in my spare time, continuing to do research in the areas of numerical linear algebra, multigrid methods, spectral graph theory and machine learning. I’m also an avid chess player, and I have aspirations of eventually being a titled player one day.”

This reminded me of a tumblr post by classidiot I saw the other day, which describes how it’s common to see mathematicians who are proficient in some non-mathematical hobby (playing an instrument, dancing, hiking, and so on…), but often not the other way around. I think it’s really fantastic that John Urschel does mathematics just on the side as something he truly enjoys.


Generating scenes of Friends with a Neural Network by Andy Pandy

Andy has fed the scripts for every scene of Friends into a Recurrent Neural Network, and it learnt to generate new scenes. It has scripted some interesting events, like Monica wishing the others ‘Happy Gandolf’ just after (All the dinner enters.)
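As a far simpler stand-in for the recurrent network Andy used, a character-level Markov chain shows the same learn-to-continue-text idea: train on example text, then repeatedly sample a next character given the recent context. This is not his method, just an illustration, and all names here are invented:

```python
import random
from collections import defaultdict

def train(text, order=3):
    """Map each `order`-character context to the characters that follow it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed, n, order=3, rng=random):
    """Extend `seed` by up to n characters, sampling from learned contexts."""
    out = seed
    for _ in range(n):
        followers = model.get(out[-order:])
        if not followers:
            break  # unseen context: the chain has nothing to say
        out += rng.choice(followers)
    return out
```

An RNN differs in that it compresses arbitrarily long context into a learned state instead of memorizing fixed-length substrings, which is why it can produce stage directions and speaker names it never saw verbatim.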

Someone should take this further by using Sam Lavigne’s Videogrep to edit existing Friends footage into the scenes created by the RNN. Then I’ll post them on algopop ;) 


A Beginner’s Guide to Deep Neural Networks

A short webisode from Nat & Lo’s 20% Project for Research at Google explains in simple terms what machine learning and deep neural networks are:

Last year, we (a couple of people who knew nothing about how voice search works) set out to make a video about the research that’s gone into teaching computers to recognize speech and understand language.

Making the video was eye-opening and brain-opening. It introduced us to concepts we’d never heard of – like machine learning and artificial neural networks – and ever since, we’ve been kind of fascinated by them. Machine learning, in particular, is a very active area of Computer Science research, with far-ranging applications beyond voice search – like machine translation, image recognition and description, and Google Voice transcription.

So… still curious to know more (and having just started this project) we found Google researchers Greg Corrado and Christopher Olah and ambushed them with our machine learning questions.

More Here

Can you tell if your therapist has empathy?

“And how does that make you feel?”

Empathy is the foundation of therapeutic intervention. But how can you know if your therapist is or will be empathetic? Technology developed by researchers from USC, the University of Washington and the University of Utah can tell you.

Leveraging developments in automatic speech recognition, natural language processing and machine learning, researchers developed software to detect “high-empathy” or “low-empathy” speech by analyzing more than 1,000 therapist-patient sessions. The researchers designed a machine-learning algorithm that takes speech as its input to automatically generate an empathy score for each session.

The methodology is documented in a forthcoming article titled “Rate My Therapist: Automated Detection of Empathy in Drug and Alcohol Counseling via Speech and Language Processing.” According to the authors, it’s the first study of its kind to record therapy sessions and automatically determine the quality of a session based on a single characteristic. The study appears in the December issue of PLoS ONE.
Some things don’t change

Currently, there are very few ways to assess the quality of a therapy session. In fact, according to the researchers, the methods for evaluating therapy have remained unchanged for 70 years. Methods requiring third-party human evaluators are time-consuming and affect the privacy of each session.

Instead, imagine a natural language processing app like Siri listening in for the right phrases and vocal qualities. The researchers are building on a new field in engineering and computer science called behavioral signal processing, which “utilizes computational methods to assist in human decision-making about behavioral phenomena.”

The authors taught their algorithm to recognize empathy via data from training sessions for therapists, specifically looking at therapeutic interactions with individuals coping with addiction and alcoholism. Using automatic speech recognition and machine learning-based models, the algorithm then automatically identified select phrases that would indicate whether a therapist demonstrated high or low empathy.

Key phrases such as “it sounds like,” “do you think,” and “what I’m hearing” indicated high empathy, while phrases such as “next question,” “you need to” and “during the past” were perceived as low empathy by the computational model.
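The study's actual models work over full automatic speech recognition transcripts with trained machine-learning classifiers; as a purely toy illustration of how indicator phrases could feed a score, one could simply count the phrases quoted above. The function and scoring scheme here are invented for illustration, not taken from the paper:

```python
# Toy sketch only: count the indicator phrases quoted in the article.
HIGH = ["it sounds like", "do you think", "what i'm hearing"]
LOW = ["next question", "you need to", "during the past"]

def empathy_score(transcript):
    """Crude score: +1 per high-empathy phrase, -1 per low-empathy phrase."""
    text = transcript.lower()
    return (sum(text.count(p) for p in HIGH)
            - sum(text.count(p) for p in LOW))

print(empathy_score("It sounds like that was hard. What do you think?"))  # 2
```

A keyword counter like this would be trivially gamed; the point of the learned models is that they weigh phrasing, context, and (in the lab's newer work) prosody together.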

“Technological advances in human behavioral signal processing and informatics promise to not only scale up and provide cost savings through automation of processes that are typically manual, but enable new insights by offering tools for discovery,” said Shrikanth Narayanan, the study’s senior author and professor at the USC Viterbi School of Engineering. “This particular study gets at a hidden mental state and what this shows is that computers can be trained to detect constructs like empathy using observational data.”

Narayanan’s team in the Signal Analysis and Interpretation Lab at USC Viterbi continues to develop more advanced models — giving the algorithm the capacity to analyze diction, tone of voice, the musicality of one’s speech (prosody), as well as how the cadence of one speaker in conversation is echoed with another (for example, when a person talks fast and the listener’s oral response mirrors the rhythm with quick speech).
Quality treatment

In the near term, the researchers are hoping to use this tool to train aspiring therapists.

“Being able to assess the quality of psychotherapy is critical to ensuring that patients receive quality treatment,” said David Atkins, a University of Washington research professor of psychiatry.

“The sort of technology our team of engineers and psychologists is developing may offer one way to help providers get immediate feedback on what they are doing — and ultimately improve the effectiveness of mental health care,” said Zac Imel, a University of Utah professor of educational psychology and the study’s corresponding author.

In the long run, the team hopes to create software that provides real-time feedback or rates a therapy session on the spot. In addition, the researchers want to incorporate additional elements into their algorithm, including acoustic channels and the frequency with which a therapist or patient speaks.

Containing Multitudes

This essay was originally published in issue 5 of “The Manual” on April 4, 2016, along with a shorter piece on working in a call center. Illustration by Josh Cochran.

Consider the uncle.

A friendly enough figure from your childhood, he’s now part of the constellation of family members with whom you share much and little at once. He tells amusing stories about your parents. He’s an avid fan of the same sports team that you are. You love his children, your cousins. But he also has unpleasant political opinions that you strongly dislike hearing.

He approaches you at a holiday gathering and begins to reminisce about your mother’s infamously rowdy youth. It’s riveting and hilarious. You’re wide-eyed, nodding along, looking directly at him with attention and a slight, involuntary smile on your face. But after a few stories, he digresses into his take on some political controversy, and here is another side of the man: his position seems not merely incorrect but deeply objectionable. Your replies are short, the bare minimum; you don’t maintain eye contact; perhaps you excuse yourself, mentioning that you need to refill your drink. (And perhaps you do.) Whereas the first subject yielded great conversation, the second halts it.

At the next gathering—assuming that he’s normatively socialized—your uncle might be likelier to bend your ear about your mom and cousins than about his opinions on politics. Your subtle, soft signals conveyed to him that you prefer some subjects to others, and both of you get more of what we all seek in social interactions if he respects those preferences. He gets your affection, attention, and appreciation; you’re entertained by stories of mom’s salad days. Best of all, no painful confrontations or laborious, preemptive declarations of acceptable subjects were needed. Fluidly, you came to an understanding that will be iterated on over the course of your lives. He will occasionally test your interest in proximate areas—as you will his—and together you’ll negotiate a conversational arrangement that works fairly well for both of you.

If we can only dream of such a successful resolution with family members, we at least know this process with friends and acquaintances. This “mutual personalization” of relationships is a constant, ubiquitous, and vital part of how we order our lives. We send and receive signals about one another’s attention, interest, and mood unceasingly, often involuntarily. Likewise, we tailor our own attention, expression, and behavior to achieve appropriate concord with interlocutors, and in doing so as individuals we aggregate into groups aligned around shared norms.

Our signals and responses range from the subtle and unconscious to the overt and deliberate, and they’ve evolved with us over the course of millennia. They are sometimes described as part of etiquette; they help us maintain harmonious relationships in different areas of our lives (and at different times). For groups, they constitute community standards and can even become the status quo. A rich set of subtle and multivalent signals allows individuals to preserve themselves even as they meet the demands of others and of groups, for good and ill.

Online, it’s a different story.

Instead of using the rich signaling vocabulary humanity has developed, our digital social relations are governed by very simple data models and UI schemes. There are often just a handful of actions users can take in social software, and most are overt and public. Except in the most advanced systems, the options regularly sum to a single choice: “I want to see everything from my uncle” or “I never want to see anything from my uncle.”
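The "single choice" described above can be seen directly in the data model: a boolean follow edge, versus the graded, per-topic preferences a real relationship actually has. A hypothetical sketch (the structures and numbers are invented):

```python
# What most social software stores: all or nothing.
binary_follow = {("you", "uncle"): True}

# What the relationship actually looks like: graded, per-topic interest.
graded_follow = {
    ("you", "uncle"): {
        "family stories": 0.9,   # love hearing about mom's rowdy youth
        "sports": 0.8,           # shared team
        "politics": 0.05,        # please, no
    },
}
```

No one would fill in those numbers by hand, which is the essay's later point: the graded model has to be inferred from soft signals, not administered explicitly.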

If only your uncle were that simple; if only anyone were.

Containing Contending Multitudes

Humans “contain multitudes,” in Whitman’s words, but it’s hard to feel at ease with our multiplicity when any utterance might be met with confrontation or sudden, summary rejection. While we can fault the judgmental, the truth is that we designers have created this situation; for example, by giving hundreds of millions of users a single room in which to discuss football games and funerals, protest marches and gossip, or by making stressfully explicit who “follows” whom. In such spaces, it’s amazing that any expressions occur without blowback.

This is bad enough for each of us, but worse is that individual anxieties about judgment, expression, and norms aggregate into group tensions. The impossibility of subtly negotiating multiple communities’ expectations and boundaries results in much of the notoriously intense harassment, moralizing, othering, and shaming we see online. These tribal behaviors are, after all, some of the tools communities use to substantiate themselves for their members’ well-being.

That’s not because there’s anything necessarily wrong with these communities, either. To feel safe and to communicate efficiently, people must have shared norms. Individuals and groups incessantly and instinctively attempt to establish such norms online, largely without stable success. We have few walls, little privacy, less tradition, no soft signaling, and more emboldened—often anonymous—interlopers. The online scrum is, in many ways, a battle for reliable community norms in spaces that hold many partially or fully incompatible people and groups.

In sum: we are living in simple software, and norms are colliding. Deprived of gentle means for achieving mutual personalization, we cannot escape undesirable interactions and content without social costs. Painfully, we also become the objectionable other to people with whom we’d have perfectly rewarding, fluid, continually refined relationships in real life. Everyone must take everyone else in full or not at all, and if everyone is either in or out—of social circles, of scenes—community membership becomes a contentious proposition. Belonging becomes binary; total identification with a community is mandatory, and communities must aggressively assert their norms and both protect and police their memberships. They punish non-compliance within and react against the other outside, as threatened communities do.

For people and communities, this has not only social implications but moral ones, as online spaces become zones of culture conflict in which we must judge and be judged. The “chilling effect” on expression is real; some individuals muzzle the selves they suspect aren’t universally palatable, while the brash come to dominate discourse. For systems designers, it is one of many problems that approach the political in nature. Many attempt to address the problem with increasingly legislative policies about what is and isn’t acceptable behavior. But who decides what’s acceptable is itself a political question.

One serious error is to think that there are “good” users and “bad” users, and that we need merely to provide reporting tools to allow the ferreting out and banning of the latter. While there are truly bad actors who must be removed, they cause a minority of clashes. In real world terms, crime is less common than incompatibility in its many forms. So social software designers shouldn’t aspire to be legislators of what’s “good” but rather framers whose systems allow individuals and communities to determine their own mores. This is a difficult challenge, but a mandatory one; as the Russian novelist Aleksandr Solzhenitsyn wrote:

If only there were evil people somewhere insidiously committing evil deeds, and it were necessary only to separate them from the rest of us and destroy them. But the line dividing good and evil cuts through the heart of every human being.

For designers of products with many users, it’s crucial to understand not only the practical relativity of good and evil, but also that humans have many selves, some of which come and go during their lifetimes. A well-designed system—like a well-designed government—mitigates the costs of discordant differences while allowing individuals the maximum degree of freedom to be themselves, even as it encourages communities to form and benefit from their own norms and traditions.

Your uncle isn’t an evil person, after all. But when you must judge him in full, he—like most humans—falls short of perfection. On the other hand, you know some of your opinions must irritate him. Do you want a world in which unanimity of opinion is required even for mere acquaintanceship? Of course not!

Still, seeing his posts bums you out, makes you feel argumentative, sets you off on vexing internal debates with imagined foes about issues you don’t even intend to be thinking about. So: do you unfollow your uncle? Do you care how that makes him feel?

And what if, rather than an uncle, it’s a friend or colleague, or a boss or mentor? And imagine this dilemma repeated for every relationship between every pair of people! How will your community—whatever it is—achieve a safe and reliable composition that lets members “be themselves” without getting aggressive about intruders who don’t share your norms (aggression which may itself be norm-violating)?

Why do these ostensibly social systems make social life harder? And what can be done about it?

Patterns We Copy

Most social software is based on existing software patterns rather than how we live and coexist. Real-world social dynamics are so complex that we can hardly understand them, let alone imagine how they might be mirrored in, say, a user interface. Even if we were to try to match their complexity—presenting a user with hundreds of sliders, checkboxes, and options for responding to posts or reacting to another user—all we’d accomplish is overburdening her with administrative tasks. It would never accurately capture her full social sentiments, and regardless, it would be time-consuming and annoying.

Indeed, most of our interfaces require explicit, conscious action, and that in itself is problematic for the replication of our full range of signals, many of which are, again, unconscious or ambiguous. Sometimes the precise mechanism of a signal is that its ambiguity—your uncle may wonder, “Does she really need a drink, or is she tired of my talking politics?”—permits both parties to interpret it in the most personally palatable way. Face-saving is important. User interfaces are not generally well-suited for ambiguous signals, let alone unconscious ones.

But clever designs find ways around this. Consider the issue for a dating app: How can we make finding a partner no more painful than it is in the real world, and hopefully less? How can we mitigate the anxiety of people in a delicate social situation involving approval and rejection?

We can start by considering how people protect feelings in real life. One very common method is lying. Say you ask for a phone number from someone at a bar and get it, but it’s fake. This saves face that night—while you’re intoxicated, with your friends, in public—and allows you to process your feelings however you like the next day: “I must have been too drunk to hear the number right!” Even if you do feel rejected, it’s still less likely to embarrass you than being rejected face-to-face; and besides, what can you do? Indeed, lying is a popular solution: “I’m seeing someone” also works in this case. We lie even to our friends: “Sure, I’d love to do that!” we say face-to-face, and later send the email “Oh my gosh, it turns out we have plans.” And so on.

But lying isn’t really supported in software. We can lie to other people through software—for example, all profile bios—but lying to software—having software operate with false ideas of what we want or think—isn’t compatible with achieving utility. A dating app that people lie to about whom they like will not work very well!

Another solution is to use intermediaries: “Pat, can you ask Lee if Jesse likes me?” Long after grade school, forms of this persist. We attempt to validate whether we’re liked (or not) through a third party in part because intermediaries translate and soften signals. But dating services in which you involve your friends as wing-people are rare.

The answer provided by the double-opt-in mechanic common to Tinder and many other services borrows from both of these real-world solutions: have an intermediary, systematic function depersonalize some of what happens, rendering signals ambiguous. This way, no one can know that they’ve been rejected. Individuals can be more at ease, and the community will have fewer disturbances caused by the social costs of approval and rejection.

In effect, this outsources lying and uses a third party to soften the blow. When you “approve” of a person but never hear back, it is the service’s refusal to distinguish between “people who haven’t seen you” and “people who reject you” that saves you face, as though the service is giving you fake phone numbers. You can only wonder: “Was there just a harmless miss, or was I rejected?” This is an outstanding solution, because it not only restores but actually amplifies the ambiguity of the real-world social process. In truth, it’s hard to approach people and ask for numbers. It’s often the case that we can tell when we’re liked or disliked; and with mobile phones, creeps test phone numbers right away anyway!
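The double-opt-in mechanic itself is tiny to express: a like is stored privately and only surfaced when it becomes mutual, so a one-sided like is indistinguishable from never having been seen at all. A minimal sketch (class and method names are invented, not any real service's API):

```python
# Minimal sketch of a double-opt-in ("match") mechanic.
class MatchService:
    def __init__(self):
        self._likes = set()  # (liker, liked) pairs, never shown to users

    def like(self, user, other):
        """Record a like; reveal a match only if it is already mutual."""
        self._likes.add((user, other))
        return (other, user) in self._likes

svc = MatchService()
svc.like("alice", "bob")   # False: bob learns nothing either way
svc.like("bob", "alice")   # True: only now is the mutual match revealed
```

All of the face-saving lives in what the system refuses to disclose: the `_likes` set is the intermediary holding everyone's secrets.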

So this solution enhances the capacity of individuals to make free choices with reduced fear of social cost. In this sense, these services improve on reality by taking the solutions we use in the real world, abstracting them to consider their consequences, and then figuring out how software can achieve the same consequences with different mechanisms.

Translating Mechanisms

Let’s try to generalize the problem for any social software: How can we enable mutual, painless personalization of social experiences online? What features of evolved real-world individual and community social dynamics can we replicate with current technology?

There are countless possibilities at many levels of design. I’ll mention one abstractly: systems should be able to fluidly recognize and concentrate communities of users with soft borders, permitting less explicit affiliations and departures but still supporting zones where community norms abide. There are systematic and user-interface problems to solve, but doing so would likely reduce the community-defining and community-protecting behaviors that make public spaces online so problematic. Networks in which we can be our bar-selves, work-selves, gossip-selves, activist-selves, parent-selves, critical-selves, and other-selves without interference—city-like networks in which the bar and city hall aren’t the same space, but also aren’t private, rigidly defined, members-only spaces—are hard to imagine visually but will exist someday.

In the meantime, there are other technologies being used to solve these sorts of problems. Among them, machine-learning personalization is the assistive intermediary function par excellence. Best known as what powers Facebook’s “Top Stories” news feed, machine-learning personalization aggregates hundreds of explicit and implicit signals, including some that are subtle or even unconscious. It acts as the intermediary whom we blame or credit and whose role lessens the social cost of our preferences. It continually explores our preferences and refines its model as it (and we) change over time. Meanwhile, it requires little to no administration and is fundamentally diversifying, as it creates maximally individuated software experiences.

It achieves this diversity through a process very much like that used in real-world social situations. When a machine-learning system first “meets” you, it must make some truly random guesses, unless there’s any inherited contextual information from the start (for example, you’ve connected another service that it can mine for data). As it learns about you, it can increasingly relate you with cohorts (based on vectors of signals). It can also continually introduce test content in the proportion you seem to favor, from proximate or orthogonal cohorts or even randomized. This is more or less how humans operate when they meet, of course: some inherited data—perhaps an outfit or an introduction—guides initial explorations, but as we form a mental model of whom we’re dealing with, we get better at guessing whether they’ll enjoy talking about sports or politics or technology or food. If we’re smart and decent, we don’t stereotype; such signals are directional, but not exclusionary. So too with machine learning, which never “finishes” learning about each user or reduces her to a flat, unchanging profile.
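One simple way to realize "test content in the proportion you seem to favor" is an epsilon-greedy policy: mostly serve the best-known preference, occasionally probe something else. This is a generic textbook sketch, not any particular product's algorithm, and the topic scores are invented:

```python
import random

def pick_topic(scores, explore=0.1, rng=random):
    """Usually serve the best-scoring topic; occasionally probe another."""
    if rng.random() < explore:
        return rng.choice(list(scores))    # exploration: test content
    return max(scores, key=scores.get)     # exploitation: known favorite

# Hypothetical learned preferences for one user.
prefs = {"sports": 0.9, "politics": 0.2, "food": 0.5}
```

Production systems replace both branches with learned models, but the structure is the same: the system never stops sampling outside its current estimate of you, which is what keeps the profile from flattening into a stereotype.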

Indeed, machine-learning personalization of content is possibly the most democratic editorial process yet deployed at scale. In a well-personalized feed, no one’s conception of what’s best matters but yours, and that remains true even if you don’t know what you like or lack the time, ability, or interest to describe all the valences of interests and habits that constitute your full identity. A system with sophisticated machine learning has, in effect, deployed an attentive assistant whose priority is to find out what you care about, which people you want to hear from, what content you find objectionable, and even how your moods and tastes vary with time and context.

But machine-learning personalization has been controversial in the design community, partly because of confusion about how we socialize in reality.

Firehose or Fascism

Critics of machine-learning personalization tend to make one of three claims.

First, some fear that personalization concentrates “control” in the hands of the network owners, who tinker with opaque algorithms whose details we can never know. But network owners don’t want control; they want our use and attention. Personalization can be computationally costly, but companies choose to bear those costs because they must provide users with good experiences—whatever that means to each of us—or we’ll find other networks. Machine-learning personalization doesn’t mean that networks—let alone persons working for the network—decide what you see; it means that you decide what you see. A bad feed, which through omission censors content users want, will eventually drive us away from any network, no matter how popular or powerful.

A second view is that even if we control our feeds, personalization partializes our view of the world, trapping us in “filter bubbles” that deny us access to novel or dissenting views. However, this is mistaken too. Personalization is a constant, daily fact, not a new technological phenomenon. We all adjust our signals, our environments, our social circles, our media intakes to be as we want, and typically we only gainsay the choices of others (especially others whose opinions we disagree with). But no one should cede control of their bookshelves, evenings, television remote, party invitation list, or the like to an imposed conception of “what a person should experience,” dictated by these critics or anyone else.

Furthermore, non-personalized social software is not an option: as networks scale and every user’s graph grows, simple chronological feeds become unmanageable. We can burden the user with the social and administrative costs, or we can have systems bear those costs for them, as traditions and norms do in the real world. But we cannot prescribe the social and informational diet, as it were, for others, and it’s especially important that designers remember this; we are not arbiters of what’s good; we create so that humans can be empowered to pursue their own ends, not ours.

And what’s more, the comparison we must make is not an ideal mixture of content and perspectives versus a personalized feed; it’s the reality of human informational intake historically versus what humans experience online. And no one could plausibly claim that people receive less information (in quantity or diversity) today, with feeds, than they did 50 years ago, with local papers, mores, and community norms dominating individual cultures.

The third major concern is that machine-learning personalization is difficult, and poor execution results in frustrating software, content, and social experiences. This is absolutely true, and will remain an issue—as it is on Facebook, for example—until machine-learning solutions improve, are standardized, and are commoditized. But this is true of everything in technology, and these problems are soluble.

The Illusion of Control

Machine-learning personalization is just one means of achieving real-world ends in software, of course. But it’s illustrative of how open-minded we should be in evaluating technology. It’s crucial that designers think seriously and pragmatically about consequences rather than mapping their reactions to moralizing narratives. The idea that personalization is about corporate or political control is an emotionally satisfying but inaccurate one. It ignores how humans, human societies, and machine learning all work. It also ignores the problems personalization is trying to solve: to help people navigate an ocean of content and many types of social connections.

If some of our experiences have made us wary of personalization, most of us have had moments where the opposite is true, too. How brittle personalization is—how dependent our experiences are on it working well—is itself variable between products and designs. How much personalization interferes with a user’s cognitive model of your software, for example, is something to think about and mitigate.

But the time when there were primarily power users online is over. Most users do not want the “control” of RSS and Twitter lists and blocking, muting, and unfollowing their fellows. Nor do they want our view of what they should read, whom they should know, or how they should act. They want to be empowered to find the information that matters to them, share and interact with the people they choose, and experience the world on their terms. Not only does personalization not thwart information diversity, it helps diverse individuals live and learn as they please. And empowering people with that kind of control should be—for designers who favor democracy—a lifelong goal.

Machine learning books

I really like machine learning, and hopefully there are others on the site who do too. I stumbled upon a site that has a lot of helpful links to PDFs teaching the basics of different areas of the field. I initially posted them on a sideblog, but I don’t think it would reach anyone through it, so I decided to submit it here.

This is the site with all the links

I’ll also quickly highlight the links I find the most interesting.

A brief intro to neural networks

“A Brief Introduction To Neural Networks provides a comprehensive overview of the subject of neural networks and is divided into 4 parts: Part I: From Biology to Formalization — Motivation, Philosophy, History and Realization of Neural Models, Part II: Supervised Learning Network Paradigms, Part III: Unsupervised Learning Network Paradigms, and Part IV: Excursi, Appendices and Registers.”

A first encounter with machine learning (Highly recommended)

“A First Encounter With Machine Learning is a precursor to more technical and advanced textbooks. It was written to fill the need for a simple, intuitive textbook to introduce those just starting in the field to the concepts of machine learning.”

Bayesian reasoning and machine learning

(This is by the director of the MSc in Computational Statistics and Machine Learning at UCL, which works in conjunction with the Gatsby unit — I thought that might be interesting to know :D)

“The book is designed to appeal to students with only a modest mathematical background in undergraduate calculus and linear algebra. No formal computer science or statistical background is required to follow the book, although a basic familiarity with probability, calculus and linear algebra would be useful. The book should appeal to students from a variety of backgrounds, including Computer Science, Engineering, applied Statistics, Physics, and Bioinformatics that wish to gain an entry to probabilistic approaches in Machine Learning.”

Deep learning

(Yoshua Bengiiooooooo)

“Deep Learning is a textbook intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. It is divided into 3 parts: Part I: Applied Math and Machine Learning Basics, Part II: Modern Practical Deep Networks, Part III: Deep Learning Research”

An introduction to statistical learning (Quite advanced)

“This book provides an introduction to statistical learning methods. It is aimed at upper-level undergraduate students, master’s students and Ph.D. students in the non-mathematical sciences. The book also contains a number of R labs with detailed explanations on how to implement the various methods in real life settings, and should be a valuable resource for a practicing data scientist.”

(submitted by tekasaurusrex)


Regressing 24 Hours in New Orleans

Another machine learning experiment from Samim applies regression methods to moving images, breaking each frame down into visual segments to create a polygonal, Modernist style:

Regression is a widely applied technique in machine learning … Regression analysis is a statistical process for estimating the relationships among variables. Let’s have some fun with it ;-)

… This experiment tests a regression-based approach to video stylisation. The following video was generated using Stylize by Alec Radford. Alec extends Andrej Karpathy’s implementation and uses a fast Random Forest Regressor. The source video is a short by JacksGap.
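The image-regression idea behind Stylize can be sketched in a few lines: treat each pixel as a training example, fit a regressor that maps pixel coordinates (x, y) to colour values (r, g, b), then render the model’s predictions back out as an image. Here is a minimal version using scikit-learn’s RandomForestRegressor on a small synthetic gradient rather than a real video frame; the parameters are illustrative, not Radford’s actual settings:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Build a small synthetic "image": a smooth colour gradient.
h, w = 32, 32
ys, xs = np.mgrid[0:h, 0:w]
coords = np.stack([ys.ravel() / h, xs.ravel() / w], axis=1)   # features: (y, x)
colors = np.stack([xs.ravel() * 8.0, ys.ravel() * 8.0,
                   np.zeros(h * w)], axis=1)                  # targets: (r, g, b)

# Regress colour as a function of position; the forest's piecewise-constant
# predictions are what give the stylised output its blocky look.
model = RandomForestRegressor(n_estimators=10, random_state=0)
model.fit(coords, colors)

stylized = model.predict(coords).reshape(h, w, 3)
print(stylized.shape)  # (32, 32, 3)
```

On a real frame you would load the pixel array with an image library and write the predictions back out frame by frame; coarser forests give chunkier, more “polygonal” results.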

You can find out more about the machine learning experiment here


Two pizzas sitting on top of a stove top oven

Google’s machine learning algorithms are now capable of understanding scenes in images (object detection, classification and labeling) and can automatically translate them into accurate natural language descriptions. Their goal:

This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images.

However, these are only a few of the use cases with a great need for visual intelligence; every observation, security & surveillance company will (for better or worse) be pleased as well.

And Google is not alone. A team at Stanford is also working on neural networks for visual recognition; their paper is titled “Deep Visual-Semantic Alignments for Generating Image Descriptions”:


Interesting simultaneity: I’m currently reading Kill Decision by Daniel Suarez, in which a team - at the vision lab in Stanford - develops a visual intelligence. It allows machines (drones, in the course of the book) to identify objects in video feeds and gives them the cognitive ability to discern what’s occurring in a scene: “concept detection, integrated cognition, interpolation - prediction.”

Worth a read, if you’re interested in robocalypse stories and (dystopian) use cases for computer vision & autonomous systems.

[read more] [paper] [h/t for the stanford link to iamdanw]

The Fasinatng … Frustrating … Fascinating History of Autocorrect
Invoke the word autocorrect and most people will think immediately of its hiccups—the sort of hysterical, impossible errors one finds collected on sites like Damn You Autocorrect. But despite the inadvertent hilarity, the real marvel of our mobile text-correction systems is how astoundingly good they are.

Wired on the history of autocorrect, a brief shoutout to Gricean relevance, and how Microsoft decided what to do with swear words, aka why you have to teach your new phone that you’re not trying to say “ducking”. 

The notion of autocorrect was born when Hachamovitch began thinking about a functionality that already existed in Word. Thanks to Charles Simonyi, the longtime Microsoft executive widely recognized as the father of graphical word processing, Word had a “glossary” that could be used as a sort of auto-expander. You could set up a string of words—like insert logo—which, when typed and followed by a press of the F3 button, would get replaced by a JPEG of your company’s logo. Hachamovitch realized that this glossary could be used far more aggressively to correct common mistakes. He drew up a little code that would allow you to press the left arrow and F3 at any time and immediately replace teh with the. His aha moment came when he realized that, because English words are space-delimited, the space bar itself could trigger the replacement, to make correction … automatic! Hachamovitch drew up a list of common errors, and over the next years he and his team went on to solve many of the thorniest. Seperate would automatically change to separate. Accidental cap locks would adjust immediately (making dEAR grEG into Dear Greg). One Microsoft manager dubbed them the Department of Stupid PC Tricks.
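The mechanism Hachamovitch describes, a glossary of corrections fired by the space bar once a word is complete, fits in a few lines. A toy sketch (the correction entries and the caps-lock heuristic are illustrative, not Microsoft’s actual rules):

```python
# Toy autocorrect in the spirit of the original Word hack: because English
# words are space-delimited, each completed word can be checked and replaced.
CORRECTIONS = {
    "teh": "the",
    "seperate": "separate",
}

def fix_caps_lock(word):
    # dEAR -> Dear: accidental caps lock leaves the first letter lower-case
    # and the remaining letters upper-case.
    if len(word) > 1 and word[0].islower() and word[1:].isupper():
        return word[0].upper() + word[1:].lower()
    return word

def autocorrect(text):
    words = [fix_caps_lock(CORRECTIONS.get(w.lower(), w))
             for w in text.split(" ")]
    return " ".join(words)

print(autocorrect("teh seperate dEAR"))  # -> the separate Dear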


It wasn’t long before the team realized that autocorrect could also be used toward less productive—but more delightful—ends. One day Hachamovitch went into his boss’s machine and changed the autocorrect dictionary so that any time he typed Dean it was automatically changed to the name of his coworker Mike, and vice versa. (His boss kept both his computer and office locked after that.) Children were even quicker to grasp the comedic ramifications of the new tool. After Hachamovitch went to speak to his daughter’s third-grade class, he got emails from parents that read along the lines of “Thank you for coming to talk to my daughter’s class, but whenever I try to type her name I find it automatically transforms itself into ‘The pretty princess.’”

(Read the rest.)

Researchers Discover Brain Representations of Social Thoughts Accurately Predict Autism Diagnosis

Psychiatric disorders — including autism — are characterized and diagnosed based on a clinical assessment of verbal and physical behavior. However, brain imaging and cognitive neuroscience are poised to provide a powerful new tool.

Carnegie Mellon University researchers have created brain-reading techniques to use neural representations of social thoughts to predict autism diagnoses with 97 percent accuracy. This establishes the first biologically based diagnostic tool that measures a person’s thoughts to detect the disorder that affects many children and adults worldwide.

Published in PLoS One, the study combined functional magnetic resonance imaging (fMRI) and machine-learning techniques first developed at Carnegie Mellon that use brain activation patterns to scan and decode the contents of a person’s thoughts of objects or emotions. The previous work also demonstrated that specific thoughts and emotions have a very similar neural signature across normal individuals, suggesting that brain disorders may display detectable alterations in thought activation patterns.

Now, the research team led by CMU’s Marcel Just has successfully used this approach to identify autism by detecting changes in the way certain concepts are represented in the brains of autistic individuals. They call these alterations “thought-markers” because they indicate abnormalities in the brain representations of certain thoughts that are diagnostic of the disorder.

“We found that we could tell whether a person has autism or not by their brain activation patterns when they think about social concepts. This gives us a whole new perspective to understanding psychiatric illnesses and disorders,” said Just, the D. O. Hebb University Professor of Psychology in the Dietrich College of Humanities and Social Sciences and a leading researcher into the neural basis of autism. “We’ve shown not just that the brains of people with autism may be different, or that their activation is different, but that the way social thoughts are formed is different. We have discovered a biological thought-marker for autism.”

For the study, Just and his colleagues scanned the brains of 17 adults with high-functioning autism and 17 neurotypical control participants. The participants were asked to think about 16 different social interactions, such as “persuade,” “adore” and “hug.”

The resulting brain images showed that the control participants’ thoughts of social interaction clearly included activation indicating a representation of the “self,” manifested in the brain’s posterior midline regions.

However, the self-related activation was near absent in the autism group. Machine-learning algorithms classified individuals as autistic or non-autistic with 97 percent accuracy based on the fMRI thought-markers.
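The classification step, activation patterns in and a diagnosis out, follows a standard small-sample neuroimaging recipe: represent each participant as a vector of activation features and cross-validate a classifier, typically leave-one-out with only 34 participants. A purely hypothetical sketch on synthetic data (the study’s actual features and classifier are not specified here; the numbers below are invented to mimic the depressed “self” activation):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 34 participants, each summarised by 16
# activation features (one per social-interaction concept). Feature 0 plays
# the role of the "self" representation, near absent in one group.
n_per_group = 17
control = rng.normal(loc=1.0, scale=0.3, size=(n_per_group, 16))
autism = rng.normal(loc=1.0, scale=0.3, size=(n_per_group, 16))
autism[:, 0] -= 1.5  # depressed self-related activation

X = np.vstack([control, autism])
y = np.array([0] * n_per_group + [1] * n_per_group)

# Leave-one-out cross-validation: classify each participant with a model
# trained on the remaining 33, as is common for small neuroimaging samples.
scores = cross_val_score(LogisticRegression(), X, y, cv=LeaveOneOut())
print(f"accuracy: {scores.mean():.2f}")
```

With a group difference this clean, a simple linear classifier already separates the synthetic groups almost perfectly, which is the same logic that lets a single strong thought-marker drive high diagnostic accuracy.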

“When asked to think about persuading, hugging or adoring, the neurotypical participants put themselves into the thoughts; they were part of the interaction. For those with autism, the thought was more like considering a dictionary definition or watching a play — without self-involvement,” Just said.

Implications of this research could extend to other psychiatric disorders, such as being suicidal or having obsessive-compulsive disorder, in which certain types of thoughts are altered. By providing a brain-based measure of the altered thoughts to use in conjunction with clinical assessments, this new research could enable clinicians to make quicker and more certain diagnoses and more quickly implement targeted therapies that focus on the alteration.

“This is a potentially extremely valuable method that could not only complement current psychiatric assessment. It could identify psychiatric disorders not just by their symptoms but by the brain systems that are not functioning properly. It may eventually be possible to screen for psychiatric disorders using quantitative biological measures of thought that would test for a range of illnesses or disorders,” Just said.

This neuroscience research is on the vanguard of two fronts: it advances the scientific mission of classifying and diagnosing mental disorders based on behavioral and neurobiological measures (rather than conventional symptoms), and it integrates the conception of brain and mind by assessing thoughts in terms of brain function.


Creative Applications of Deep Learning with TensorFlow

New online course from Kadenze put together by PK Mital will teach you how to use Google’s machine learning platform Tensorflow for creative projects:

This course introduces you to deep learning: the state-of-the-art approach to building artificial intelligence algorithms. We cover the basic components of deep learning, what it means, how it works, and develop code necessary to build various algorithms such as deep convolutional networks, variational autoencoders, generative adversarial networks, and recurrent neural networks. A major focus of this course will be to not only understand how to build the necessary components of these algorithms, but also how to apply them for exploring creative applications. We’ll see how to train a computer to recognize objects in an image and use this knowledge to drive new and interesting behaviors, from understanding the similarities and differences in large datasets and using them to self-organize, to understanding how to infinitely generate entirely new content or match the aesthetics or contents of another image. Deep learning offers enormous potential for creative applications and in this course we interrogate what’s possible. Through practical applications and guided homework assignments, you’ll be expected to create datasets, develop and train neural networks, explore your own media collections using existing state-of-the-art deep nets, synthesize new content from generative algorithms, and understand deep learning’s potential for creating entirely new aesthetics and new ways of interacting with large amounts of data. 

The online course is free ($10 a month for premium service) - you can find out more here


Artificial Intelligence learns to beat Mario like crazy.