It turns out there are a TON of Daft Punk fans on Tumblr, posting some awesome images and GIFs. As an homage to all of you and this much-anticipated release, New Division Digital put everything together in one full-screen experience. Check out randomaccessdj.com. It plays all the posts for you, or lets you DJ them yourself, while you listen to Random Access Memories on Spotify.

Everything is based on the APIs that Tumblr and Spotify provide.

Perhaps this brings the old CD jacket artwork to a new level? 

Enjoy the dance party.

tumblr.js JavaScript client

Today I’m excited to announce the release of tumblr.js, the first of several official API clients we’ll be rolling out over the next few months.

You can install it now with npm, and start making something awesome:

var tumblr = require('tumblr.js');
var client = tumblr.createClient({
  consumer_key: 'consumer_key',
  consumer_secret: 'consumer_secret',
  token: 'oauth_token',
  token_secret: 'oauth_token_secret'
});

// Name all of the authenticating user's blogs
client.userInfo(function (err, data) {
  if (err) { return console.error(err); }
  data.user.blogs.forEach(function (blog) {
    console.log(blog.name);
  });
});

It comes with full support for all of the API V2 endpoints including tag search, following, liking, and post creation. For more detail, see the GitHub page.

More to come soon!

Hey all,

I’m happy to announce the release of the 0.8.0 version of our tumblr_client gem! This release is a major bump from our 0.7.5 release! An important change is the removal of the :raw_data parameter for posting. If you need this parameter, do not update past 0.7.5.


  • Removing support for :raw_data parameter for posting
  • Adding support for multipart file uploading
  • Removing all custom OAuth code and letting better tested libraries take care of it for us!
  • Allowing you to configure any of your favorite ruby HTTP clients supported by Faraday!

If you have any issues, let me know here, ping @codingjester on Twitter, or report them directly here.

Get hacking everyone! 

Tumblr project in progress that, for a given blog, will tell you how many active followers you have based on your likes and comments.

I realised that many of my followers are no longer active and wondered how many were regularly liking my blog and therefore who I really should be supporting if not already!

Full geek mode activated.

аnimage alternative interface
and a simple way to search any tumblr by 2 tags at once

As I promised, I’ve created a way to access аnimage’s pictures without having to use his blog’s interface, that he disabled a while ago.

Just go to http://seedmanc.tumblr.com/tumblr2search, enter the tag(s) and get the pictures. No custom themes, no reblogs and likes, no bullshit, only images. 20 posts worth of pictures per page, or you can try to load all found images at once if you feel confident.

But wait, there’s more! I also implemented a long-sought feature really missing from tumblr - ability to search by 2 tags at once, getting only the posts having both of them (intersection search). Now if you want to get all the pictures of Nana in yukata and only them - you can do just that.
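A minimal sketch of how such an intersection search can work, assuming the posts for the first tag have already been fetched and that each post carries its full tag list under a "tags" key (as Tumblr API posts do). The function name and the sample data are made up for illustration, and this is not the code behind the search page.

```python
def filter_posts_with_all_tags(posts, required_tags):
    """Keep only the posts that carry every tag in required_tags (case-insensitive)."""
    wanted = {tag.lower() for tag in required_tags}
    matching = []
    for post in posts:
        post_tags = {tag.lower() for tag in post.get("tags", [])}
        if wanted <= post_tags:  # subset test: all wanted tags are present
            matching.append(post)
    return matching


# Hypothetical posts, e.g. the result of a search for the first tag
posts = [
    {"id": 1, "tags": ["Nana", "yukata"]},
    {"id": 2, "tags": ["nana"]},
    {"id": 3, "tags": ["yukata", "NANA", "summer"]},
]
both = filter_posts_with_all_tags(posts, ["nana", "yukata"])
print([post["id"] for post in both])
```

Fetching by one tag and filtering locally by the other keeps the number of API requests down to what a single-tag search would need.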

In addition, I provided an autocomplete feature for tag input fields, that has all the tags from аnimage’s Tag Category page, around 200 of them sorted alphabetically.

Those of you poor souls still using this piece of s…oftware, Chrome, might have a hard time making use of this, though, because your browser is so progressive it doesn’t even support vertical scrollbars for this HTML5 feature. Neither Opera nor Firefox has this problem.

You know, it is in the interests of the entire community to have the word spread.

Coding my new website

I am coding a new eax.dk using CakePHP.
The problem is; I want to learn a fuck-ton of things, but I noticed that I don’t really
know what to add.

So currently it’s just;
* Frontpage (Landing-page, about)
* Blog (Will get blogs from my tumblr blog :3)

But I am pretty sure that is too much.

What I really need, I guess, is just to have my blog as the front-page? No menu or anything like that.
Yes that will do.

So what I am “trying” (read: doing), to do, is get my blog-posts from my tumblr, live using jQuery/Ajax, and then show them on the page.

This is going quite fine. I am using the v1 version of Tumblr’s API, simply because V2 doesn’t work for me >_>

Oh! And I ordered my VPS today :3 It will be ready in 3-6 days >_> Wtf!


OK so my code is still a work in progress, I still have to make it look pretty and I still need to make it so anyone can use it (and work out why tumblr returns old data) but………

What you see here is the number of unique supporters who have liked/commented on my blog in the last 40 posts plus the top ten of those people.

Obviously it doesn’t account for people who only read your blog, but I define a supporter as someone who is liking/commenting.
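A rough sketch of that count, assuming each post has been fetched with its notes attached as a "notes" list whose entries have "type" and "blog_name" keys. The function name and the sample data are invented for illustration, and this is not the code behind the screenshot.

```python
from collections import Counter


def count_supporters(posts, own_blog=None, top_n=10):
    """Return (number of unique supporters, top_n supporters with their counts)."""
    counts = Counter()
    for post in posts:
        for note in post.get("notes", []):
            # Only likes and replies count as support here; plain reblogs are ignored
            if note.get("type") not in ("like", "reply"):
                continue
            name = note.get("blog_name")
            if name and name != own_blog:
                counts[name] += 1
    return len(counts), counts.most_common(top_n)


# Hypothetical notes on two posts
posts = [
    {"notes": [{"type": "like", "blog_name": "a"},
               {"type": "reply", "blog_name": "b"},
               {"type": "reblog", "blog_name": "c"}]},
    {"notes": [{"type": "like", "blog_name": "a"}]},
]
total, top = count_supporters(posts)
print(total, top)
```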

I hope when I make the functionality live you guys find it useful :)

Using Neo4j to map Tumblr communities

A few weeks ago @napszemuvegbe posted a tool that uses the Tumblr API to scrape blogs it deems to be Hungarian ones, and then converts the result to image files using Graphviz.

Since we use Neo4j at the company where I work to map relations between members, I thought it would be a fun project to move the data into Neo4j, so one could run various queries against it.

Note: all of the code can be accessed at https://github.com/sztupy/tumblr-neo4j. I also tried to make it as accessible as possible by utilising Docker (I did most of the dev work under Windows, for example). I was also lucky enough to be able to attend GraphConnect in London, where I learned a few tricks that I used here.

The Data

The data is gathered using a fork of node-tumblr-map. It is quite simple: it starts by downloading the latest posts of someone, then collects the blogs they reblogged from and continues with those. As that would download all blogs on Tumblr, there is an additional criterion: the blog has to have at least one post or reblog in Hungarian (the check is done using Google’s CLD library). This criterion can potentially be modified for other communities as well. The download takes a while (the Tumblr API does use throttling), and the end result is around 2GB of raw data. By default the script downloads 2 weeks’ worth of data from every blog, but always at least 100 posts, so there will be enough information on blogs which are inactive or don’t post much.
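The crawl described above is essentially a breadth-first search with a filter. Here is a minimal sketch in Python (the actual tool is a Node.js script, so this is not its code). The fetch_posts and is_hungarian callbacks and the "text" key are assumptions for illustration; "reblogged_from_name" is the field the Tumblr API puts on reblogs.

```python
from collections import deque


def crawl(start_blog, fetch_posts, is_hungarian, max_blogs=1000):
    """Breadth-first crawl that only expands blogs passing the language check."""
    queue = deque([start_blog])
    seen = {start_blog}
    accepted = {}
    while queue and len(accepted) < max_blogs:
        blog = queue.popleft()
        posts = fetch_posts(blog)
        # Criterion: at least one post or reblog must be in Hungarian
        if not any(is_hungarian(post.get("text", "")) for post in posts):
            continue
        accepted[blog] = posts
        for post in posts:
            source = post.get("reblogged_from_name")
            if source and source not in seen:
                seen.add(source)
                queue.append(source)
    return accepted


# Tiny fake network standing in for the API and the language detector
fake_posts = {
    "a": [{"text": "szia", "reblogged_from_name": "b"},
          {"text": "hello", "reblogged_from_name": "c"}],
    "b": [{"text": "jó napot", "reblogged_from_name": "a"}],
    "c": [{"text": "hello", "reblogged_from_name": "d"}],
    "d": [{"text": "szia"}],
}
found = crawl("a", lambda blog: fake_posts.get(blog, []),
              lambda text: text in ("szia", "jó napot"))
print(sorted(found))
```

Because blog "c" fails the check, the crawl never reaches "d" even though "d" would pass; that is the price of keeping the crawl from spreading over all of Tumblr.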

Converting the data

Since we’ll be using LOAD CSV to import the data we need the data in CSV format. I’m using some ruby scripts to convert the raw data to CSV files. There are also some preprocessing steps, to circumvent some of the restrictions of the Tumblr API:

  • We know the IDs of the posts, but storing them would be too much. It is much more important to only know about posts where something was added. Therefore we map each of the post IDs to the ID of the post where a comment was last added.
  • Likes are apparently second class citizens in Tumblr. Although you like specific posts, in the API you’ll mainly only see which threads were liked by whom. There is an option to download the likes from people, but not all blogs enable this functionality. For our purposes we need to use the second method, even if it means we only gather publicly accessible likes.

Once the processing is done it will spit out a few CSV files ready to import. 
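As an illustration of the first preprocessing step, here is a hedged Python sketch (the real conversion uses Ruby scripts, which are not reproduced here). It assumes each post has a "trail" list of earlier posts in the reblog chain, each with a "post" id and its added "content", and it makes up the CSV column names.

```python
import csv
import io


def last_comment_id(post):
    """ID of the most recent trail entry that added some text, or None if there is none."""
    for entry in reversed(post.get("trail", [])):
        if entry.get("content", "").strip():
            return entry["post"]["id"]
    return None


def posts_to_csv(posts):
    """Write one row per post: its ID, its blog, and the comment ID it maps to."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["post_id", "blog", "comment_id"])  # hypothetical header
    for post in posts:
        comment_id = last_comment_id(post)
        writer.writerow([post["id"], post["blog_name"],
                         "" if comment_id is None else comment_id])
    return out.getvalue()


# Hypothetical input: post 30 reblogs a chain in which only post 10 added text
sample = [
    {"id": 30, "blog_name": "c",
     "trail": [{"post": {"id": 10}, "content": "<p>hi</p>"},
               {"post": {"id": 20}, "content": ""}]},
    {"id": 40, "blog_name": "d", "trail": []},
]
print(posts_to_csv(sample))
```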

The data structure

Let’s look at how we’re going to represent information inside our graph. We’ll have three types of nodes:

  • :Blog - These nodes represent the individual blogs. Their ID is the name of the blog 
  • :Thread - These nodes group together one specific thread of “discussion” on Tumblr. It is represented by the ID of the initial comment
  • :Comment - These nodes represent reblogs where some kind of comment was added. These nodes are represented by the ID of the post where the comment was added. The first 50 characters of the comment are also retained in the node

These nodes are linked together by a lot of different relations:

Some :Blog->:Blog relations:

  • REBLOGGED_BLOG: :Blog->:Blog relation when someone reblogged another blog at least once. This relation also contains some statistical information on how many times a specific type of post (like photo posts or quote posts) was reblogged
  • REBLOGGED_THREAD_OF: :Blog->:Blog relation when someone reblogged someone’s thread. This can show what the reach of one’s posts is, even if they themselves don’t have many followers.
  • REBLOGGED_COMMENT_OF: this is similar to the previous one, but it shows the reach of a specific comment’s poster instead of the thread starter. On Tumblr it is quite usual that a snarky reblog turns a 5-10 note blog post, which would otherwise quickly disappear into oblivion, into one in the 1000+ note region. Tracking these comments is hard, as Tumblr usually only cares about the poster of the thread, but it can be useful in investigating trends

Some relations between other nodes:

  • POSTED shows who posted a Thread
  • COMMENTED shows who wrote a specific comment
  • IS_REPLY_OF shows the order of comments, etc.

Importing data

Importing the data then uses the quite common USING PERIODIC COMMIT LOAD CSV command. Unfortunately this command has some bugs in the 3.0 final release, so I actually used the last milestone version of 3.0. The bug was supposedly fixed for 3.1 though. There is not much magic in the import: I read through the CSV files multiple times, once for each node or relation type, instead of reading them once and creating everything in one command.


Once the data is loaded I run some post-processing steps, one of them being calculating PageRank based on the relations as links. Originally I implemented the algorithm in Cypher itself, which does work on smaller sets of data: it worked on the initial test, which had around 10,000 :Blog nodes, all of them having at most 50-60 relations with other nodes. But it wasn’t fast enough on an improved dataset where there are a lot of relations, and other nodes as well, so I just had to try out the new stored procedure functionality (especially after the hype it got at GraphConnect in London) and write the PageRank algorithm in Java. I tried to make this generic enough to be used in various applications, so if you require a PageRank algorithm for Neo4j you might find it useful.

PageRank is an iterative algorithm, and this procedure only runs one iteration per call. I had hoped you could create transactions and do commits from within a stored procedure, so it would be feasible to run multiple iterations with a commit between them, but unfortunately this doesn’t work well. So instead I just call the procedure from a bash script 50 times (which should be enough to reach a stable PageRank for this amount of data).
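To make the "one iteration per call, repeated 50 times" idea concrete, here is a small in-memory sketch of textbook PageRank in Python. It is not the Java stored procedure from the repository; the damping factor of 0.85 and the handling of nodes without outgoing links are common defaults assumed here, and the graph is made up.

```python
def pagerank_step(ranks, out_links, damping=0.85):
    """Run a single PageRank iteration and return the new rank of every node."""
    n = len(ranks)
    new_ranks = {node: (1.0 - damping) / n for node in ranks}
    for node, rank in ranks.items():
        targets = out_links.get(node, [])
        if targets:
            share = damping * rank / len(targets)
            for target in targets:
                new_ranks[target] += share
        else:
            # Node without outgoing links: spread its rank evenly over all nodes
            share = damping * rank / n
            for target in new_ranks:
                new_ranks[target] += share
    return new_ranks


def pagerank(out_links, iterations=50):
    """Start from a uniform distribution and repeat the single step a fixed number of times."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    ranks = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        ranks = pagerank_step(ranks, out_links)
    return ranks


# Hypothetical reblog graph: "a" and "b" both reblog from "c", "c" reblogs from "a"
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(sorted(ranks, key=ranks.get, reverse=True))
```

Keeping the single step as its own function mirrors the setup described above, where an outer loop (there, a bash script) decides how many iterations to run.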

I calculate the PageRank using various relations, and at the end collate those numbers into one big number, which I named Hunblarity.

Finalizing the data

Since I make the data available to the public, once all of the post processing is done I finalise the database: I make it read-only, ready to be pushed to the internet.

Browsing the data

Browsing the data is done using the built in Neo4j browser. I wanted to make it possible for non-technical people to use it as well, hence I created a site, where you can fill in some parameters, and it will redirect to the Neo4j browser and load up the pre-filled Cypher request. Unfortunately this is not done easily, as you cannot really parameterise the browser, so I had to resort to tricks including putting the site on the same domain as the Neo4j browser, and manipulating Neo4j’s local storage to open up my desired cypher query as default.

An example run showing a specific Thread (in red) with Blogs (purple), and Comments (green) and their relations:

Final remarks

This was definitely a fun project, and I’m looking forward to figuring out some more details from the database. One thing I’m really looking forward to is trying out the SLM clustering algorithm from this blog post. I also plan on gathering the data bi-weekly (since the whole process takes around a week to run on my notebook) and checking how the data changes over time.

Some notes I found while doing this work:

  • The Neo4j browser could be more configurable. The hack I had to do to be able to post queries to the browser is ugly. Also while it looks cool to have all of those nodes flying around, the animation does have a huge performance impact when there are a lot of nodes.
  • Also the default config inside the browser is set to use Bolt, and will not work properly if Bolt is disabled (I’ve chosen not to enable it, so the whole site can run on port 80. I’ve also read about some issues with Bolt when authentication is disabled)
  • There is no way to do commits during a long query, even from within stored procedures. While I understand why, it does make some algorithms harder and/or slower to implement. I’ve seen that some people are actually using “USING PERIODIC COMMIT LOAD CSV” with a fake CSV file to get around this issue, something I might try as well (once the issues around periodic commits get fixed in 3.1)


Source code: https://github.com/sztupy/tumblr-neo4j

The final data: http://hunblarity.sztupy.hu/ (click the button below “Csak úgy böngészés” to get to the browser site with all of the Neo4j browser settings in place)

anonymous asked:

Is there any way you can make a slide show of your photos?

Indeed. I have set up a slideshow here. It will transition through my 300 most recent photos.

*nerd note: I used the tumblr api, and jquery with the “Supersized” slide show plugin. 

I can’t tell if you’re asking me if I had one, or how I coded it. Let me know if you want details. It’s not trivial, however, and you need some JavaScript chops.
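For a rough idea of the API side, here is a hedged Python sketch of collecting the most recent photo URLs page by page (the slideshow itself uses jQuery, so this is not its code). It assumes a PyTumblr-style client whose posts() call accepts type, limit and offset, and photo posts shaped like the Tumblr API returns them, with a "photos" list and an "original_size" URL. The fake client only stands in for the network.

```python
def recent_photo_urls(client, blog_name, wanted=300, page_size=20):
    """Page through a blog's photo posts until enough photo URLs are collected."""
    urls = []
    offset = 0
    while len(urls) < wanted:
        response = client.posts(blog_name, type="photo", limit=page_size, offset=offset)
        posts = response.get("posts", [])
        if not posts:
            break  # ran out of photo posts
        for post in posts:
            for photo in post.get("photos", []):
                urls.append(photo["original_size"]["url"])
        offset += len(posts)
    return urls[:wanted]


class FakeClient:
    """Stand-in for a real API client: serves 45 single-photo posts."""

    def posts(self, blog_name, type=None, limit=20, offset=0):
        items = [{"photos": [{"original_size": {"url": "u%d" % i}}]} for i in range(45)]
        return {"posts": items[offset:offset + limit]}


print(len(recent_photo_urls(FakeClient(), "example.tumblr.com", wanted=30)))
```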

Pagination in Tumblr themes

Usage is documented in Tumblr’s API documentation at this address:


This is the pagination code in Tumblr. Pagination is placed inside the pair of shortcodes {block:Pagination} and {/block:Pagination}.


[Image: pagination code snippet]

The next page goes inside the tag pair:

{block:NextPage} and {/block:NextPage}

The previous page goes inside the tag pair:

{block:PreviousPage} and {/block:PreviousPage}

The previous and next page links take the form:

<a href="[previous page link / next page link]">Previous page / Next page</a>

For the next page:

<a href="{NextPage}">{lang:Next page} &rarr;</a>

{NextPage} - the shortcode that outputs the link to the next page, for example: http://khuongkhukho.tumblr.com/page/2

The content, text, or anchor text that gets displayed is whatever sits between the opening <a> tag and the closing </a> tag:

{lang:Next page} &rarr;

{lang:Next page} - the text “Next page”, written this way so that it can be translated into whatever language the visiting Tumblr user is using, for example turning Next page into “Trang tiếp” :))

You can also replace {lang:Next page} &rarr; directly with: Trang tiếp

&rarr; - the HTML entity for a right arrow.

The previous page works the same way as the next page.

- Pagination determines how many page numbers are shown for the user to click, for example: 1, 2, 3, 4, 5.

This 1, 2, 3, 4, 5 part goes inside the tag pair:

{block:JumpPagination length="5"} and {/block:JumpPagination}

The opening tag takes a length parameter that sets the maximum number of pages shown.

Within this 1, 2, 3, 4, 5 there are two cases:

  1. The current page - no link
  2. The other pages - linked so they can be clicked

- The current page goes inside the tag pair:

{block:CurrentPage} and {/block:CurrentPage}

The content for the current page is:

<span class="current_page">{PageNumber}</span>

{PageNumber} - the number of the current page

- The remaining pages go inside the tag pair:

{block:JumpPage} and {/block:JumpPage}

The content of this part is:

<a class="jump_page" href="{URL}">{PageNumber}</a>

This block is a loop.

{URL} - the shortcode that outputs the link of the page corresponding to

{PageNumber} - the page number corresponding to that link.

The pagination code is usually placed near the bottom of the theme. For example, below the tag


You can also add classes and ids to style the pagination with CSS :)))

It only looks long-winded written out like this. If anyone finds it too complicated, just ask me; for me it takes no time at all, CSS included :))

For example, on the blog toilahoang.tumblr.com I switched from endless loading to next/previous pagination :))

ChangeLog for the Week of 11/09/12

This week, we are excited to bring a new feature to the API: the ability to consume a user’s likes if they have been made public in their Settings.

Here’s an example request:


You’ll receive a payload of Post objects that have been liked by that user. If you attempt to access a blog that has not shared their likes, you will get our standard 401.

This could become a guessing game as to which blogs allow access to their likes. To prevent this, we’ve added a new field, share_likes, to the blog object, accessible from the /info and the /user/info routes.

Let’s look at an example payload from the /info route.

Please check out the latest documentation for any further details. As always, our Ask box is open for feedback and questions!
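As a rough illustration of checking share_likes before asking for likes, here is a minimal Python sketch. It assumes a PyTumblr-style client with blog_info() and blog_likes() methods returning the response dictionaries ("blog" and "liked_posts" keys); the helper name and the fake client are made up so the example runs without credentials.

```python
def fetch_public_likes(client, blog_name):
    """Return a blog's liked posts, or None when the blog does not share its likes."""
    blog = client.blog_info(blog_name).get("blog", {})
    if not blog.get("share_likes"):
        return None  # skip the request that would end in a 401
    return client.blog_likes(blog_name).get("liked_posts", [])


class FakeClient:
    """Stand-in for a real API client, used only for this example."""

    def __init__(self, share_likes):
        self.share_likes = share_likes

    def blog_info(self, blog_name):
        return {"blog": {"name": blog_name, "share_likes": self.share_likes}}

    def blog_likes(self, blog_name, **kwargs):
        return {"liked_posts": [{"id": 1}], "liked_count": 1}


print(fetch_public_likes(FakeClient(True), "example.tumblr.com"))
print(fetch_public_likes(FakeClient(False), "example.tumblr.com"))
```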

Playing with Tumblr API

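1. Install the oauth2 package
2. Install PyTumblr
3. Register an application
4. Go to the application and get your OAuth credentials
5. Enter your credentials in your Python file

import pytumblr

# Placeholders: replace with the four values from your registered application
client = pytumblr.TumblrRestClient('<consumer_key>', '<consumer_secret>', '<oauth_token>', '<oauth_secret>')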

6.  To get the name and title of all blogs you are following

off = 0
while True:
    my_dict = client.following(offset=off)
    res = my_dict['blogs']
    if not res:
        break  # no more pages of followed blogs
    for rs in res:
        print(rs['name'] + "…." + rs['title'])
    off += len(res)  # move on to the next page

7. Number of posts liked for each blog

off = 0
like_dict = {}
while True:
    my_dict = client.blog_likes('conflatedthought.tumblr.com', offset=off)
    res = my_dict['liked_posts']
    if not res:
        break  # no more liked posts
    for rs in res:
        # Count one like per source blog
        if rs['blog_name'] in like_dict:
            like_dict[rs['blog_name']] += 1
        else:
            like_dict[rs['blog_name']] = 1
    off += len(res)  # move on to the next page

for the_key, the_value in like_dict.items():
    print(the_key, 'corresponds to', the_value)

8.  Sample Output

sportspage….Sports Page
themobilemovement….The Mobile Movement
adidasfootball….adidas Football
instagram-engineering….Instagram Engineering
sony….Sony on Tumblr
yahoolabs….Yahoo Labs
taylorswift….Taylor Swift
beyonce….Beyoncé | I Am
itscalledfutbol….Did someone say “futbol”?
futbolarte….Futbol Arte
fcyahoo….FC Yahoo
yahooscreen….Yahoo Screen
engineering….Tumblr Engineering
yahoodevelopers….Yahoo Developer Network
mongodb….The MongoDB Community Blog
yahooeng….Yahoo Engineering
marissamayr….Marissa’s Tumblr
staff….Tumblr Staff
narendra-modi….Narendra Modi
nytvideo….New York Times Video
bonjovi-is-my-life….Bon Jovi♥ Is My Life
game-of-thrones….You win or you die.
gameofthrones….Game of Thrones: Cast A Large Shadow
forzaibra….Forza Ibra

A Small Tumblr Hacking Project, and What I Learned

Recently I’ve been working on writing some “back end” code for a Tumblr blog that wants to do some non-standard things using Tags.

I’m all about trying to hack the Tumblr platform: hacking with the Tumblr API & doing unorthodox things in Tumblr themes.

The project is called Lensblr (http://lensblr.com). I’m not directly involved with creating the site. I’m just writing some devious back-end stuff to create added functionality.

The project caught my attention because navigation & view of the Tumblr is done exclusively through use of the /tagged/foobar pages. There is no view of the “main post feed”.  It’s an interesting idea towards creating a custom Tumblr website.

Magical Post Tags

One goal of the designer was to automatically add/remove specific tags from posts, based on set criteria.  In this case, the criteria is rather simple:

  • Based on the number of “notes” a post gets, add/remove tags that “move” the post to/from different pages on the Tumblr.

    For example, after a post reaches 50 notes, we want to remove the tag that displays the post on one page, and move it to another page called “Featured Posts”

This is easily implemented using a cron job that runs a script to analyze the posts on the blog, and uses the Tumblr API to “retag” the posts as needed.
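The retagging rule itself fits in a few lines. Below is a minimal sketch in Python; the tag names "new" and "featured" and the function name are made up for illustration (the 50-note threshold is from the example above), and the actual Lensblr script is not shown here.

```python
def retag(tags, note_count, threshold=50, new_tag="new", featured_tag="featured"):
    """Return the tag list a post should have, given how many notes it has."""
    # Drop both placement tags, then add back the one that applies
    kept = [tag for tag in tags if tag not in (new_tag, featured_tag)]
    kept.append(featured_tag if note_count >= threshold else new_tag)
    return kept


print(retag(["photo", "new"], 50))
print(retag(["photo", "featured"], 10))
```

A cron-driven script would call something like this for every post and send an edit request through the API only when the returned list differs from the post’s current tags.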

Of course, this kind of tag modification for changing how a site is presented could be extended and extrapolated to do lots of interesting things, based on any number of factors. It’s a promising idea.

Magical Post Shuffling and Randomization

A second goal of the designer behind Lensblr was to implement a way so that posts on a particular /tagged/ Page could be randomly shuffled, such that older posts would sometimes reach the first page.

  • The principle is more equal exposure of content, regardless of whether that content was posted 2 months ago or an hour ago.

  • Not all content is Timely; and most people never make it past the 1st or 2nd page of a blog.

Fortunately, you can modify the Published Date on posts using the Tumblr API.  And when viewing posts through the standard “Posts” interface (mounted at / ), this works rather well. You are able to shuffle posts around so that older and newer posts get equal chance of exposure on the 1st or 2nd page of the blog.
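One simple way to shuffle is to permute the existing publish dates among the posts, so the overall set of dates stays the same. Here is a minimal Python sketch of that idea; it assumes posts are dictionaries with "id" and "date" keys, the function name is made up, and it is not the Lensblr code.

```python
import random


def shuffled_dates(posts, rng=None):
    """Map each post ID to a publish date drawn from the existing dates, randomly permuted."""
    rng = rng or random.Random()
    dates = [post["date"] for post in posts]
    rng.shuffle(dates)
    return {post["id"]: date for post, date in zip(posts, dates)}


# Hypothetical posts with Unix timestamps as dates
posts = [{"id": 1, "date": 1000}, {"id": 2, "date": 2000}, {"id": 3, "date": 3000}]
print(shuffled_dates(posts, random.Random(0)))
```

Each post whose new date differs from its old one would then be updated through the API.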

Unfortunately, Tumblr seems to have a bug/limitation with sorting posts by “Time” on the /tagged/foobar pages:

  • If a post is on page 3 of /tagged/foobar, even modifying the post’s Publish Date is not enough to ever bounce it up to page 1.  The post is “stuck” on page 3 forever. (or page 4, 5… etc as more posts are added).

Interestingly, on the individual pages  /tagged/foobar/page/X  the posts *are* sorted chronologically – but only relative to the other posts on that page.

Based on this, I’ve concluded that while Tumblr “sorts by Time” it only does so per page; the actual posts that are put on a given page come from “sorting by Post ID”.  It’s a bit inconsistent.
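A small Python model of that conclusion makes the effect easy to see. This only simulates the observed behaviour under the stated hypothesis (pages cut by post ID, then sorted by date within each page); it is not Tumblr’s actual implementation, and the data is made up.

```python
def tagged_pages(posts, per_page=2):
    """Split posts into pages by descending ID, then sort each page by descending date."""
    by_id = sorted(posts, key=lambda post: post["id"], reverse=True)
    pages = [by_id[i:i + per_page] for i in range(0, len(by_id), per_page)]
    return [sorted(page, key=lambda post: post["date"], reverse=True) for page in pages]


# Post 1 is the oldest by ID but has been re-dated to be the newest
posts = [
    {"id": 1, "date": 100},
    {"id": 2, "date": 20},
    {"id": 3, "date": 30},
    {"id": 4, "date": 40},
]
for page in tagged_pages(posts):
    print([post["id"] for post in page])
```

In this model the re-dated post moves to the top of its own page but never leaves it, which matches the “stuck on page 3” behaviour described above.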

So although we had a sound principle and strategy, implemented the code to “shuffle” the posts around – it just doesn’t work within Tumblr’s buggy / limited system.

Working within Tumblr’s Limitations

Tumblr does not provide Developers with many good means for representing blog Content with different “views” or in different ways. Using Tags seems to be the only practical way to achieve these ends, for now.

Things that are not particularly useful:

  • The “Static Pages”

    While these can contain arbitrary HTML completely separate from the main “Theme” they are rather useless from the perspective of creating / loading dynamic content, or content that changes with high frequency (say once a day).

    “Pages” are not accessible through the Tumblr API, so making automatic changes to these Pages with code is not possible.

  • The Tumblr API

    The API is OK if you are developing an application for the desktop or a mobile device. It lets you do *most* things you might want to do to create a “Tumblr Experience” on a mobile device, or say a “Tumblr Posting / Editing” program on the Desktop.

    Where the API is not particularly useful is in trying to create dynamically generated content on web pages.

    1. Tumblr only supports OAuth1.0 (they should really upgrade to an OAuth2.0 interface)

    2. Tumblr does not support CORS, a technology that allows controlled “cross-site” AJAX calls.  Without CORS, as a developer you are limited to using rather ugly JSONP callbacks, and then only HTTP “GET” commands.

    3. The available methods in the Tumblr API in many cases do not provide sufficient means to request the specific information you want: there are not enough “filters” to select the data you want, and literally no means to control the “output” received from the API.  For example, if all I want is the list of “Tags” from posts on a blog - I still have to download *all that other data* which has no use for me. That adds up rather quickly to a *lot* of transferred data.

      The API needs better methods of “selecting” the types of posts to return, and then needs to implement a way to filter down the data returned to only what you require in order to make the API useful in the context of creating dynamic web pages.

    4. The API Server is just plain slow. Response times I observe are often in the range of 600-800 ms for simple requests (say “grab 50 posts”), and up to 2000-3000 ms for larger requests (say, “grab 200 posts”).

      Part of the problem is the aforementioned lack of ability to “filter down” to just the data you need.

      Another part of the problem is that the API Servers refuse to use HTTP Keep-Alive if you will be requesting multiple pieces of data in quick succession.  The API Server should enable Keep-Alive.

So Why Bother?

Because Tumblr is a great social platform. You bother trying to create custom/specialized Tumblr pages to create a unique “Web Experience” while at the same time being able to take advantage of the social aspects that Tumblr can provide to your Page and your Content.

As a Developer however – this is all rather frustrating.

Anyway, Tumblr just put up a Job Listing for API Lead. Perhaps that is a sign that things will get better in the near future.

But for now, it’s all pretty foobar.

ChangeLog for the Week of 08/07/12

This week we have a few new features to announce for developers!

First up, we have added the ability for applications to access private blogs & posts for a user when sending a fully authenticated request to /posts and /user/info. This has been one of the most requested features on the API Forum, so we’re excited to release it to you all. The full documentation is currently up and available here.

Here are a few other documentation tweaks that have gone up:

  • Liked - A Boolean field that tells you if the currently authenticated user has liked a post.
  • Queued Docs - Updates to pagination documentation.
  • Submissions Docs - Updates to pagination documentation.

We also want to let you know that we’ve opened up Asks and we’re more than happy to help field any problems or questions you may have!

Python-based post upload system

So after today’s experiments, I think I finally have a solid-ish, functioning system for uploading text posts without ever needing the web interface!

At the moment I haven’t automated the process of putting images in a temporary folder in Dropbox. Instead, the process is:

  1. write your post in markdown in a text editor. write a line that’s just IMAGE:filename wherever you want an image.
  2. copy all the images in your post into a temporary public upload folder and wait until Dropbox confirms they’re available online
  3. run the script

Great! What is the script?

import pytumblr
import argparse
import re

dropbox_upload_folder = "{YOUR_DROPBOX_FOLDER}/Public/temp_upload/"
#Public URL under which the files in that folder are served
#(placeholder: this value is not given in the original post)
dropbox_url_prefix = "{YOUR_DROPBOX_PUBLIC_URL}/temp_upload/"

#Function for turning "IMAGE:foo.png" into HTML format for foo.png hosted on dropbox
def image_syntactify(match):
    #Should be passed a regex match object with the actual image name in the first group
    #add the HTML:
    return "<img src='" + dropbox_url_prefix + match.group(1) + "'/>"

def syntactify_all(body):
    return re.sub(r"IMAGE:(.+)",image_syntactify,body)

#Authenticate for the session (placeholders: use your own application's four credentials):
client = pytumblr.TumblrRestClient('<consumer_key>', '<consumer_secret>', '<oauth_token>', '<oauth_secret>')

#Set up command line arguments
parser = argparse.ArgumentParser(description='Get text to post from file')
parser.add_argument('title',help="Title of the new text post.")
parser.add_argument('filename',help="File containing text post contents.")
parser.add_argument('blog',help="Blog URL to create post on.")
parser.add_argument('tags',help="Comma-separated list of tags.")
args = parser.parse_args()

#Read the body text into memory
with open(args.filename) as body_f:
    body_text = body_f.read()

#Counteract tumblr's weird double-backslash reduction nonsense
body_text = body_text.replace("\\\\","\\\\\\\\")

#Find lines which are images and replace them with image syntax
body_text = syntactify_all(body_text)

#Post the post (the original call is missing; this assumes pytumblr's create_text with markdown format)
client.create_text(args.blog, title=args.title, body=body_text, tags=args.tags.split(","), format="markdown")

To call this, you’d write something like

python "markdown_uploader.py" "Title of the post" "body_text.txt" your_url "tag1,tag2,tag3,tag4"

To make this work in Python 3, you need to update pytumblr to work with Python 3. This can mostly be accomplished by running Python’s provided 2to3 script on each of the .py files before installation, but you also need to change the line data = json.loads(content) in requests.py to data = json.loads(content.decode('utf-8')).