data dogs

Love at first analysis

It’s almost impossible not to fall in love with every puppy you come across. But when it comes to guide dogs, pairing the right one with a blind or visually impaired owner requires much more than puppy love. Guiding Eyes for the Blind, a non-profit guide dog organization, uses Watson Analytics to screen over half a million canine health and temperament records in the IBM Cloud. That’s hundreds of data points for each pup to help increase the chances of a successful pairing.



When I was little the sky was closer, so much closer. That’s why I like the rain. It’s like I can smell the sky coming.

Categorizing Posts on Tumblr

Millions of posts are published on Tumblr every day. Understanding the topical structure of this massive collection of data is a fundamental step to connect users with the content they love, as well as to answer important philosophical questions, such as “cats vs. dogs: who rules on social networks?”

As a first step in this direction, we recently developed a post-categorization workflow that aims to associate posts with broad-interest categories, where the list of categories is defined by Tumblr’s on-boarding topics.


Posts are heterogeneous in form (video, images, audio, text) and consist of semi-structured data (e.g. a textual post has a title and a body, but the actual textual content is unstructured). Luckily enough, our users do a great job at summarizing the content of their posts with tags. As the distribution below shows, more than 50% of the posts are published with at least one tag.

However, tags define micro-interest segments that are too fine-grained for our goal. Hence, we editorially aggregate tags into semantically coherent topics: our on-boarding categories.

We also compute a score that represents the strength of the affiliation (tag, topic), which is based on approximate string matching and semantic relationships.

Given this input, we can compute a score for each pair (post,topic) as:


  • w(f,t) is the score (tag,topic), or zero if the pair (f,t) does not belong in the dictionary W.
  • tag-features(p) contains features extracted from the tags associated with the post: raw tag, “normalized” tag, n-grams.
  • q(f,p) is a weight in [0,1] that takes into account the source of the feature (f) in the post (p).
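
The definitions above suggest a weighted sum over tag features. The following is a minimal sketch under that assumption, i.e. that the score of (post,topic) is the sum of w(f,t) · q(f,p) over the features f in tag-features(p). The dictionary contents, the source weights in `Q`, and the helper names are all hypothetical, and the n-gram step is reduced to single words for brevity.

```python
from collections import defaultdict

# Hypothetical dictionary W: (feature, topic) -> affiliation score w(f,t).
W = {
    ("corgi", "Pets"): 0.9,
    ("dog", "Pets"): 0.8,
    ("runway", "Fashion"): 0.7,
}

# Hypothetical source weights q(f,p) in [0,1], keyed by where the feature came from.
Q = {"raw": 1.0, "normalized": 0.9, "ngram": 0.5}


def tag_features(tags):
    """Return (feature, source) pairs: raw tag, normalized tag, single words."""
    features = []
    for tag in tags:
        features.append((tag, "raw"))
        norm = tag.lower().strip()
        if norm != tag:
            features.append((norm, "normalized"))
        words = norm.split()
        if len(words) > 1:
            features.extend((word, "ngram") for word in words)
    return features


def score_post(tags, w=W, q=Q):
    """Assumed form: score(p,t) = sum over f of w(f,t) * q(f,p)."""
    topics = {topic for _, topic in w}
    scores = defaultdict(float)
    for feature, source in tag_features(tags):
        for topic in topics:
            # Pairs missing from the dictionary contribute zero.
            scores[topic] += w.get((feature, topic), 0.0) * q[source]
    return {topic: s for topic, s in scores.items() if s > 0}


result = score_post(["Corgi", "cute dog"])
```

Here only the Pets topic receives a non-zero score, from the normalized tag "corgi" and the word "dog".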

The drawback of this approach is that it relies heavily on the dictionary W, which is far from being complete.

To address this issue we exploit another source of data: RelatedTags, an index that provides a list of similar tags by exploiting co-occurrence patterns. For each pair (tag,topic) in W, we propagate the affiliation with the topic to its top related tags, smoothing the affiliation score w to reflect the fact that these entries (tag,topic) could be noisy.
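
A rough sketch of this propagation step is shown below. How the smoothing is actually done is not specified here, so the constant `damping` factor, the `top_k` cutoff, and the shape of the RelatedTags index are assumptions made purely for illustration.

```python
# Hypothetical seed dictionary W and RelatedTags index (most similar tags first).
W = {("dog", "Pets"): 0.8}
RELATED_TAGS = {"dog": ["puppy", "corgi", "dogs of tumblr"]}


def propagate(w, related_tags, top_k=2, damping=0.5):
    """Copy each (tag, topic) affiliation to the tag's top related tags.

    Propagated scores are multiplied by `damping` (an assumed smoothing
    scheme) because entries inferred from co-occurrence may be noisy.
    """
    expanded = dict(w)
    for (tag, topic), score in w.items():
        for related in related_tags.get(tag, [])[:top_k]:
            key = (related, topic)
            smoothed = score * damping
            # Never overwrite an editorial entry or a stronger propagated one.
            if key not in w and smoothed > expanded.get(key, 0.0):
                expanded[key] = smoothed
    return expanded


expanded = propagate(W, RELATED_TAGS)
```

Keeping the editorial entries untouched means the curated scores always take precedence over anything inferred from co-occurrence.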

This computation is followed by a filtering phase to remove entries (post,topic) with a low confidence score. Finally, the category with the highest score is associated with the post.
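
The filtering and selection step might look like the sketch below. The threshold value and the function name are hypothetical; only the overall logic (drop low-confidence entries, then keep the top-scoring category) comes from the description above.

```python
def assign_category(topic_scores, min_score=0.5):
    """Drop low-confidence topics, then return the best remaining one.

    `topic_scores` maps topic -> score for a single post. Returns None
    when no topic clears the (assumed) confidence threshold.
    """
    confident = {t: s for t, s in topic_scores.items() if s >= min_score}
    if not confident:
        return None
    return max(confident, key=confident.get)


best = assign_category({"Pets": 1.21, "Art": 0.3})
uncategorized = assign_category({"Art": 0.3})
```

Returning None rather than a weak guess leaves such posts uncategorized, which is one plausible reading of the filtering phase.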


This unsupervised approach to post categorization runs daily on posts created the day before. The next step is to assess the alignment between the predicted category and the most appropriate one.

The results of an editorial evaluation show that our framework is able to identify a relevant category in most cases, but they also highlight some limitations, such as a limited robustness to polysemy.

We are currently looking into improving the overall performance by exploiting NLP techniques for word embedding and by integrating the extraction and analysis of visual features into the processing pipeline.

Some fun with data

What is the distribution of posts published on Tumblr? Which categories drive the most engagement? To answer these and other questions, we analyzed the categorized posts over a period of 30 days.

Almost 7% of categorized posts belong to Fashion, with Art as runner-up.

The category that drives the most engagement is Television, which accounts for over 8% of the reblogs on categorized posts.

However, normalizing by the number of posts published, the category with the highest average engagement per post is Gif Art, followed by Astrology.

Last but not least, here are the stats you all have been waiting for!! Cats are winning on Tumblr… for now…

3...2...1...Let's Color Pie

Few works of science fiction have captured the fun, drama, and action of the space western as well as Cowboy Bebop. Few works of fiction, period, have developed such a unified sense of style either; music, editing, color, movement, all work in unison to provide one of the fullest experiences anime has to offer.

If you haven’t entered the world of Cowboy Bebop yet, then bookmark this page, go watch it, and come back later. You’ll be doing yourself a service. And you’ll also understand this article, which will be looking at the color pie identities of the six main characters. Buckle your seatbelts, folks, and let’s jam.

Spike Spiegel

Despite wearing a navy suit, Spike is stoked by the fires of Red mana. Spike follows his gut, having little patience for the investigative part of bounty hunting. He’s reckless, rude, brash, and bawdy, and that’s why we love him. Anger drives him to compete with the cowboy Andy; apathy means he won’t bother chasing down Faye.

Spike’s central narrative, however, reveals his strongest passion: Julia. Spike used to be a hitman in the Red Dragon syndicate, but fled with Julia to pursue their love. This doesn’t work out, but Spike still jumps at any clue of Julia’s location. Emotional loyalty is one of the Reddest traits there is.

Spike’s whirlwind emotions and martial prowess solidify him as a mono-Red character.

Jet Black

In his old life, Jet Black was a hardboiled detective on Ganymede. He earned the nickname “Black Dog,” because once he got a lead, he would bite down on a case and never let it go. As a lawman, Jet was mono-White. But a betrayal by a trusted colleague left Jet physically, and emotionally, damaged.

Jet shut himself off from others. His independence and self-reliance grew out of loneliness. A pragmatic Jet Black rose from the ashes of his police days, giving us a bounty hunter with many Black traits supporting his White ones. Jet still aims to uphold the law, but now he does so with the lawlessness of the bounty hunter. His friendship with Spike tests trust constantly, Jet always skeptical of the motives of others.

Stuck between his past ideals and a realization that the world isn’t so tidy, Jet showcases the conflicts within a White/Black identity.

Faye Valentine

Ceaseless flirt, addicted gambler, and people-user. Faye Valentine is a paragon of Black/Red when we first meet her. She teams up with the Bebop crew when it’s convenient for her, often double-crossing Spike and Jet. Faye is running from debt collectors, hitmen, other bounty hunters, and more. She trusts no one, letting her wild life and in-the-moment desires propel her through the cosmos.

In the middle of the series, however, we’re exposed to this other side of Faye. She was cryogenically frozen decades ago, awoken in a world with no memory of her past. We discover Faye’s central drive: finding her place in this new world of space colonization and outlaws. Faye works hard to dig up the mysteries of her past, diligently putting the pieces together. Ultimately, however, Faye realizes she can never reclaim that part of her life. She must continue to look to the future and reforge her life.

Jetsetter Faye Valentine is introduced as a hedonistic Black/Red character, but her character develops a subtle Blue streak as well.

Edward Wong Hau Pepelu Tivrusky IV

The hacker Radical Edward is certainly the strangest member of the Bebop crew. Orphaned on the ruins of Earth, Ed is a computer genius who uses her skills to solve numerous problems in the bounty hunter lifestyle. She’s centered in Blue, using logic, reason, and technology to hack her way into our hearts.

Except when she isn’t. Sometimes Ed seems like a genius. Other times she acts almost feral, growling and howling along with Ein. These are classically Green traits, though Ed also displays Green’s naïve willingness to accept what it sees as truth. She whines and screams, hugs and hassles, showcasing her emotional Red side as well.

“Eccentric” doesn’t quite cover Ed’s Blue/Red/Green personality, which eventually takes her from the Bebop as weirdly as it brought her to the ship.


Ein

Arguably the most loveable character on Cowboy Bebop, Ein is the corgi that joins the crew early in the series. Most of the time Ein is busy doing dog things like eating and barking and peeing. You know, the Green stuff that makes animals animals.

But Ein has a secret. He isn’t just some Green animal. Ein is a genetically engineered data dog, meaning his brain and whatnot has been altered by science to be super-smart. Science making you a genius? Reeks of Blue. Ein is so smart that he’s even displayed better hacking talents than Ed. None of the crew ever recognizes Ein’s intelligence, leaving him to solve problems here and there unnoticed.

As a genetically enhanced genius-animal, Ein is solidly Green/Blue.


Vicious

Vicious is Spike’s rival in the series, though they used to be best friends when they worked for the Red Dragon syndicate. This friendship fell apart, however, when a woman got between them. Both Vicious and Spike had a thing for Julia, and Vicious is the one responsible for thwarting Spike’s escape with her.

Remaining in the syndicate, Vicious soured over the years and became even more violent. He has a harsh personality, cutting down anyone who gets in his way. Without Spike or Julia, Vicious is left to fight only for his own power. He eventually usurps the leaders of the Red Dragon, naming himself the sole inheritor of the criminal organization.

Vicious’s brutality and lust for power darken him as a classically mono-Black villain.

See You, Space Cowboy

Traditional to the western genre, Cowboy Bebop doesn’t feature a whole lot of White-aligned characters. The frontier is a land of outlaws, violence, crime, and survival. Spike and Jet make a classic bounty hunter duo, one reckless and one careful. Faye rides in like a storm to shake things up, while the smarts of Ed and Ein help the crew through their adventures. Vicious fulfils a standard archetype, his cold demeanor foiling Spike’s animated nature.

Until next time, planeswalkers, *bang*

timesvigilante  asked:

Hey, Romana, can I borrow K-9 for a bit? Th' Professor's on a space game show and he wants to know what th' shape of th' Earth is.

“Is it one of those shows where you have three life lines and one of them is phone a friend?” Romana was trying quite admirably not to be the least bit insulted by this. “Because if it is, he should know that using a robot dog, whose data banks very nearly rival those of the TARDIS, would be considered CHEATING.” She huffs and settles back behind her newspaper thank-you-very-much.

And of course Spike doesn’t get the bounty. Now, let’s learn about what those goons were doing and what the heck the data-dog is.

Classified information apparently. Which people will pay a fortune for!

Man she is sure wishing she’d bought that dog now!

Welcome to the team. What’s your name?

Find out next time then!

Okay, solely on how much I enjoyed the goon squad’s dialogue, I’ll give this episode one above the last one, and make it an 8/10. Seriously, they were hilarious. Next episode? We’ll find out when we find out. See you then!

magical-pink-lion  asked:

"You’ve got a bully breed, which means you’ve got an animal who (no matter how sweet), has to some degree a genetic disposition for dog reactivity" I see this sometimes, but never any evidence. I'm not out to criticize, but between the two extremes of "bully breeds should not be trusted" and "they're the sweetest dogs and wouldn't hurt anything ever" it's hard to find non-biased info out there. From what I can tell they are like any other dog. Is there any empirical data on their disposition?

I’ve been sitting on this a while because it’s something I wanted to make sure I had the time to respond to fully.

It’s pretty hard to find empirical studies on bully breeds in general, except in relation to dog-bite studies. Why? Because it’s hard to find the funding. Not a lot of people in the scientific world feel like the misconceptions about a single breed or group of dogs are worth disproving, except when it relates to human safety. Which means a lot of the studies out there are pretty biased.

In the dog and dog shelter world, bully breeds are anecdotally pretty well known for a couple of qualities: low frustration tolerance, a tendency towards barrier/leash reactivity, potential for dog aggression, and a high prey drive. A lot of this comes from the fact that most of the breeds considered ‘bullys’ are either terriers (who are notorious for their intensity and prey drive) or working breeds (who have also been bred for pretty specific intensities and personalities, due to their use as guards for livestock or humans). Taking into account that a lot of mixed-breed bullies (especially ones that might end up in shelters) have somewhere in their genetic history breeding for ‘bad dogs’, ‘guard dogs’, or just ‘mean-looking fuckers’ by irresponsible people, you can sort of see where that comes from.

I did a little bit of searching, and found an interesting study (Turcsán et al., 2011) of 94 breeds that looked at four quantified dog personality traits and grouped types of dogs that display similar ones into clusters.

To break it down: Traits studied were trainability, boldness, calmness and dog sociability. Trait rankings were acquired by scoring multiple individuals (10+) from each breed, and then averaging their scores for a ‘breed rating’. Here’s how the study described what a rank in each trait category meant:

“Dogs that scored low regarding the trainability trait are described by their owners as uninventive and not playful, whereas dogs that scored high on this trait are regarded as intelligent and playful. Boldness was related to fearful and aloof behaviour with a low score corresponding to a high degree of fearfulness/aloofness, and vice versa. The calmness trait describes the dogs’ behaviour in stressful/ambiguous situations. A low score on this trait indicated stressed and anxious behaviour in these situations, while a high score referred to calm and emotionally stable dogs, according to the owner. Finally, dog sociability refers to their behaviour toward conspecifics, with a low score indicating a high tendency for bullying or fighting and inversely high scores related to a low tendency.”

One of the interesting things about this study is that, because it was looking to see if there was any correlation between personality and genetic relatedness, they grouped the breeds of dogs into five general “clusters” of relatedness - and the terriers and mastiff/working dogs ended up in the same group. So there’s a great cluster of our ‘bully’ breeds right there (although obviously there were other breeds involved). That cluster was actually the second-highest populated in the study, with a little over 1000 dogs contributing data, so there’s a pretty good sample size. It turned out that the Mastiff/Terrier group showed a strong tendency to be bolder than most of the other clusters of dogs.

Then, the study categorized clusters of breeds based on the outcomes of their trait scoring. Here’s where the breeds that apply to our discussion fell into clusters based on traits:

High calm, medium trainable, high sociable, high bold: Bulldog, Staffordshire Bull Terrier

Low calm, high trainable, low sociable, low bold: American Staffordshire Terrier, Boxer, Rottweiler

Low calm, low trainable, low sociable, medium bold: Bull Terrier, Perro de Presa Canario

So our common traits among most of these breeds are a tendency towards low calm, low sociability, and a range of boldness. That fits pretty well with the stereotypes of a dog that reacts easily, doesn’t do well with stress, isn’t necessarily good with other dogs, and can run anywhere on the spectrum of being confident or nervous.

When we talk about bullies being prone to developing dog reactivity, what we’re saying is if they’re exposed to stressful situations involving other dogs, they’re more likely to develop behavioral problems based on the experiences. Same with barrier reactivity. A dog that doesn’t do well with stress isn’t going to do well being separated from something that is causing an emotionally aroused state. Same with leashes - a dog who isn’t super bold or super sociable isn’t going to do well being restrained around other animals, who might either antagonize it or just push its space boundaries.

So, basically, what this all says is that in bully breeds you’ve got a dog that is more prone to developing these behavior problems because to some degree the pump is already primed, genetically. Bullies often can be and frequently are amazing family dogs - they’re bred to be great with people, just not other dogs - but they require a high degree of early socialization and careful management to help prevent them from having negative experiences that might encourage the development of those behavioral issues. Is it necessarily going to happen? No. But they’re more likely to end up having dog reactivity issues, say, after getting attacked by another dog, than a golden retriever might. That’s something to work on especially with young dogs, since it’s always harder to break a habit or un-learn reactivity (which is often rooted in feeling unsafe or nervous) than it is to prevent the behaviors from ever occurring in the first place. With such intense dogs as bullies tend to be, that’s why managing social experiences with young dogs is so important - the more you make sure they only have positive experiences with other dogs, the more you set them up for a lifetime of success.