According to this, the next version of MemSQL will support JSON as a data type, similar to PostgreSQL’s JSON type, but with slightly (really, just slightly) better syntax and function names:

By supporting the JSON datatype within a high performance database, MemSQL enables real-time analytics on data feeds of variable structure.

  • Use standard SQL over JSON data, including built-ins, GROUP BY, JOINs, and more.
  • Create JSON indexes online.
  • Client drivers require no changes to support JSON.
  • JSON properties are updatable.

Just in case you are wondering how this sits with me yelling about a premature return to SQL, keep in mind that:

  1. MemSQL is SQL-based
  2. They’re trying to extend SQL on top of JSON

So, I guess that’s OK. It won’t be easy though, as they’ll have to address some interesting problems: null vs missing attributes, applying aggregation functions on heterogeneous types, and so on. Or they might decide to go with “f### it, just normalize your JSON data first”.
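To make those two problems concrete, here is a minimal Python sketch. It is not MemSQL's behavior or API; the sample documents, the `score` attribute, and the helper names `classify` and `sum_numeric` are all made up for illustration. It shows that a JSON null and an absent attribute are different states, and that a SUM-style aggregate has to pick a policy for values that are not numbers.

```python
import json

# Hypothetical feed: "score" is a number, a JSON null, absent, or a string.
rows = [
    '{"user": "a", "score": 10}',
    '{"user": "b", "score": null}',
    '{"user": "c"}',
    '{"user": "d", "score": "7"}',
]

MISSING = object()  # sentinel that keeps "absent" distinct from JSON null


def classify(doc, key):
    """Return 'missing', 'null', or 'present' for key in a decoded JSON object."""
    value = doc.get(key, MISSING)
    if value is MISSING:
        return "missing"
    if value is None:
        return "null"
    return "present"


def sum_numeric(docs, key):
    """Sum only real numbers; count null, missing, and non-numeric values as skipped."""
    total = 0
    skipped = 0
    for doc in docs:
        value = doc.get(key, MISSING)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            total += value
        else:
            skipped += 1
    return total, skipped


docs = [json.loads(r) for r in rows]
states = [classify(d, "score") for d in docs]
total, skipped = sum_numeric(docs, "score")
print(states, total, skipped)
```

Skipping the string "7" is only one possible policy; a database could equally coerce it to a number or raise an error, which is exactly the design decision the paragraph above is pointing at.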

Original title and link: MemSQL: Use SQL to Query JSON (NoSQL database©myNoSQL)

1.15.2013///NY Enterprise Tech Meetup @ Cooley

On Tuesday, January 15, 2013, I attended the monthly NY Enterprise technology meetup at Cooley. The featured demos were eZ Systems, a globally recognized commercial, open source content management software provider; Pursway, a big data algorithm developer that identifies social networks of customers and the key influencers in those networks; MemSQL, an in-memory database for real-time analytics…

MemSQL is at the forefront of a much larger effort to move the world’s digital data off the hard disk and into memory, a trend that will ultimately let us juggle “Big Data” not only with greater speed but with greater accuracy. Inside the data centers that underpin its popular web services, Yahoo is shifting towards an in-memory tool known as Spark, which does all sorts of data analysis, and MemSQL is just one of several in-memory databases, which can handle both data analysis and the high-speed data transactions that are an integral part of so many websites. In other words, they can drive things like online user accounts and product purchases and maybe even bank payments.

Can a new database help get Zynga back on track?

Social gaming pioneer Zynga hasn’t exactly been killing it in the earnings department since going public, but a new database system might help change that. At the very least, it could let the company do some things previously out of its reach, such as serve real-time recommendations and ads, and create advanced multi-player games. Building a better product is usually a good first step toward turning things around.

The database in question is MemSQL, the eponymous offering from a startup founded by former Facebook employees Eric Frenkiel and Nikita Shamgunov. The company, which launched in June, speeds up database operations by using in-memory storage and its own unique technique for converting SQL into C++, similar to what Facebook does for its PHP code. Frenkiel told me that other big-name customers already using the product include JPMorgan Chase, Hitachi and NY Life.

As of early January, Dan McCaffrey, GM of platform and analytics engineering at Zynga, told me that MemSQL was already storing about 70 billion rows of game data, from the server and database logs right up to social interactions. Right now, the biggest use of the MemSQL system is analyzing data from servers and databases about how an application is performing. Compared with the company’s previous MySQL setup, McCaffrey said, “we can detect problems quickly and react to them much more quickly.”

Is a more personal Mafia Wars coming?

However, the deployment is just a few months old, and McCaffrey’s team has bigger plans for its new database. Among them is the ability to make decisions based on user behavior in real-time, rather than having to rely on pre-computed user segmentation and sampled datasets. That would mean applications could offer more-personalized decisions around in-game purchases or other recommendations, while Zynga’s burgeoning ad network could serve more-targeted and more-timely ads.

Both of these uses are important. Ninety-five percent of Zynga’s revenue currently comes from in-game purchases, although McCaffrey said it expects ad revenue to make up the lion’s share by 2015. Presently, Zynga pre-computes its user classifications nightly using its Vertica analytics system. If analysts want to query streaming data in real time, they can only do so against a sampled dataset that’s small enough to fit in memory.

With nearly 100 MemSQL nodes deployed already, Zynga can now store weeks’ worth of game data in memory versus merely days’ worth in its previous MySQL-plus-Membase setup.

In addition to improving the analytic process, the new database could help inspire entirely new types of games. McCaffrey noted that Zynga doesn’t have many games in which players interact simultaneously (most rely on players taking turns), nor does it have many multiplayer games. And although McCaffrey didn’t address the issue, one has to imagine Zynga’s expected foray into online gambling will also benefit from having a bigger, faster database to manage those real-time — and real-money — transactions.

One of the goals in the engineering group, he said, is “trying to introduce things that also get our game designers to think differently.”

Of course, a database alone isn’t going to make Zynga profitable, and because MemSQL is commercially available, it’s hardly a trade secret. And as McCaffrey noted, although the experience has been good thus far, MemSQL is still a small company and there’s a lot of work to do if it’s going to expand its presence throughout Zynga. But Zynga’s excitement about a new database (and, in theory, it could have chosen any one of a number of new options) at least illustrates how critical the right technologies can be when trying to run a big business at web scale.

Memory-centric data management is confusing. And so I’m going to clarify a couple of things about MemSQL 3.0 even though I don’t yet have a lot of details.* They are:

  • MemSQL has historically been an in-memory row store, which as of last year scales out.
  • It turns out that the MemSQL row store actually has two table types. One is scaled out. The other — called “reference” — is replicated on every node.
  • MemSQL has now added a third table type, which is columnar and which resides in flash memory.
  • If you want to keep data in, for example, both the scale-out row store and the column store, you’d have to copy/replicate it within MemSQL. And if you wanted to access data from both versions at once (e.g. because different copies cover different time periods), you’d likely have to do a UNION or something like that.
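The last point, querying two copies that cover different time periods, can be sketched as follows. This is only an illustration under loud assumptions: it uses Python's built-in SQLite rather than MemSQL, and the table names `events_recent` (standing in for the row store) and `events_history` (standing in for the column store) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events_recent (day TEXT, clicks INTEGER);   -- stand-in for the row store
    CREATE TABLE events_history (day TEXT, clicks INTEGER);  -- stand-in for the column store
    INSERT INTO events_recent VALUES ('2014-02-01', 5), ('2014-02-02', 7);
    INSERT INTO events_history VALUES ('2014-01-30', 3), ('2014-01-31', 4);
""")

# One logical query over both time ranges: stitch the tables together
# with UNION ALL, then aggregate over the combined rows.
query = """
    SELECT SUM(clicks) FROM (
        SELECT clicks FROM events_recent
        UNION ALL
        SELECT clicks FROM events_history
    )
"""
total_clicks = conn.execute(query).fetchone()[0]
print(total_clicks)
```

UNION ALL rather than UNION is the deliberate choice here: plain UNION removes duplicate rows, which would silently drop legitimate repeated values before the aggregate runs.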

*MemSQL’s first columnar offering sounds pretty basic; for example, there’s no columnar compression yet. (Edit: Oops, that’s not accurate. See comment below.) But at least they actually have one, which puts them ahead of many other row-based RDBMS vendors that come to mind.

And to hammer home the contrast:

  • IBM, Oracle and Microsoft, which all sell row-based DBMS meant to run on disk or other persistent storage, have added or will add columnar options that run in RAM.

  • MemSQL, which sells a row-based DBMS that runs in RAM, has added a columnar option that runs in persistent solid-state storage.

Derrick Harris reports for GigaOM about Zynga’s deployment of a MemSQL cluster:

Zynga has deployed nearly 100 nodes of MemSQL, the hot new database from two former Facebook engineers. It might not be a magic pill for Zynga’s woes, but it could help the company boost revenue and even build new types of games. […] At the very least, it could let the company do some things previously out of its reach, such as serve real-time recommendations and ads, and create advanced multi-player games.

Zynga has been the most prominent and most quoted production deployment for Couchbase. That despite the fact that Zynga has never run stock Couchbase, but a custom in-house version.

The story is clear that the new (100 nodes) MemSQL cluster is augmenting or replacing a part of Zynga’s MySQL cluster. But they are using MemSQL to serve real-time recommendations and ads. A scenario that Couchbase touts as one of its strengths.

Original title and link: Zynga Deploys MemSQL for Real-Time Service. Where Does This Leave Couchbase? (NoSQL database©myNoSQL)

Founding A Next Generation Company

Companies across all industries generate a good amount of data. Companies are also becoming more and more aware of how valuable the data are. Data mining services are now being offered. With all the data currently stored and more coming in, there is a need for speed in storing and accessing this vast treasure trove of information.

There is one common bottleneck in the data world. This is the disk. Eric Frenkiel believes there’s a better way to handle the data. That’s why he co-founded (along with Nikita Shamgunov) MemSQL which is based in San Francisco, California. Frenkiel serves as the company CEO.

What the company offers is a memory-optimized distributed database that is 30 times faster than conventional databases on disk. Now this is a huge advancement by any measure. It also brings a welcome improvement to the cloud environment. This technology has generated much interest and venture capital has come in.

Frenkiel graduated from Stanford University with a bachelor’s degree in management science and engineering. He worked at Facebook on partnership development. Frenkiel has worked for a number of startup companies in engineering and sales engineering capacities. Now he is finally venturing on his own as an entrepreneur.

Domas Mituzas about the MemSQL vs MySQL benchmark:

Though I usually understand that those claims don’t make any sense, I was wondering what did they do wrong. Apparently they got MySQL with default settings running and MemSQL with default settings running, then compared the two. They say it is a good benchmark, as it compares what users get just by installing standard packages.

That is already cheating, because systems are forced to work in completely different profiles.

The first paragraph of the post summarizes very well the general feeling about benchmarks:

I don’t like stupid benchmarks, as they waste my time.

I think that most of the generic benchmarks are stupid, even if some generic numbers are considered interesting by software engineers. Benchmarks designed around specific scenarios of applications will most of the time give more realistic results. But even those are difficult to design and account for all the configuration options, scaling, or changes of the use cases.

Original title and link: MySQL Is Bazillion Times Faster Than MemSQL (NoSQL database©myNoSQL)

Lessons from memsql (or this blog post is 30% faster than MySQL)

This has been an interesting day. And not just because Elon Musk launched a new Tesla just a month after he left his business card at the International Space Station.

No, it was something else. In the world of database startups, a small company is receiving scathing reviews. Domas Mituzas of Facebook has taken memsql’s claims to the bank and has a detailed analysis on his blog. I totally respect and feel for these guys. I know how hard it is to build a database. I have done it before and it’s not getting easier.

Another week, another database launch. This time, surprisingly, it’s not yet another NoSQL offering. Instead, it’s a loose MySQL fork written by Nikita Shamgunov, former SQL Server engineer and ACM wunderkind, called MemSQL.

The MemSQL magic sauce is supposedly that it compiles SQL queries (I’m not sure how that makes it different from prepared statements, though until I benchmark it, I’ll give them the benefit of the doubt). It seems to gain its speed by keeping most of the database resident in memory, so again, I don’t see what makes it so different from other in-memory databases. So far, not much there.

MemSQL might employ a coding prodigy, but VoltDB has an in-memory database backed by Mike Stonebraker, and only one of the two men has his own Wikipedia entry. Yeah, I know that’s an appeal to accomplishment, but I’ll change my tune when MemSQL provides something a dozen other databases don’t already (not that I want to imply it won’t… it’s just getting a bit too much of the TC-VC-Industrial Complex hype for my taste).

MemSQL Makes Real-Time Performance on Hadoop a Reality
October 16, 2014 at 01:00PM

MemSQL, the leader in real-time and historical Big Data analysis, today announced that worldwide networking leader Cisco is using MemSQL technology to deliver real-time performance on Hadoop. The companies …

Why Cloud Makes Economic Sense In Limited Doses

One company cited, MemSQL, estimated that Amazon Web Services would have cost about $900,000 over the next three years, versus on-premises …