
knotted, bottled silence

@r3mpl1t / r3mpl1t.tumblr.com

Let me know. Let me keep my tongues and taste your blisters. Have me close enough to smell your mind's fruit. Scratch my dying shell off and away, off and away. Let me know you. Let me be to whom you are revealed.

in case you enjoy this kind of thing as much as i do, here’s some archived bits of the talk page on the “Human” article regarding the “conservation status” graphic in the infobox: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14

shoutout to the guy who proposed changing it to “extinct in the wild”!

Honestly I'm pretty tired of supporting nostalgebraist-autoresponder. Going to wind down the project some time before the end of this year.

Posting this mainly to get the idea out there, I guess.

This project has taken an immense amount of effort from me over the years, and still does, even when it's just in maintenance mode.

Today some mysterious system update (or something) made the model no longer fit on the GPU I normally use for it, despite all the same code and settings on my end.

This exact kind of thing happened once before this year, and I eventually figured it out, but I haven't figured this one out yet. This problem consumed several hours of what was meant to be a relaxing Sunday. Based on past experience, getting to the bottom of the issue would take many more hours.

My options in the short term are to

A. spend (even) more money per unit time, by renting a more powerful GPU to do the same damn thing I know the less powerful one can do (it was doing it this morning!), or

B. silently reduce the context window length by a large amount (and thus the "smartness" of the output, to some degree) to allow the model to fit on the old GPU.
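For context on option B: in a transformer, the key/value attention cache grows linearly with context length, so halving the window roughly halves that slice of GPU memory. A back-of-envelope sketch (the layer count, head sizes, and fp16 assumption here are illustrative guesses, not the bot's actual configuration):

```python
def kv_cache_bytes(n_layers, n_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate key/value cache size for a decoder-only transformer.

    The leading factor of 2 covers keys and values; bytes_per_elem=2
    assumes fp16 storage.  All model dimensions here are hypothetical.
    """
    return (2 * n_layers * batch_size * n_heads * head_dim
            * seq_len * bytes_per_elem)

# e.g. a GPT-2-XL-sized model (48 layers, 25 heads of dim 64):
full = kv_cache_bytes(48, 25, 64, seq_len=1024)
half = kv_cache_bytes(48, 25, 64, seq_len=512)
# halving the context window halves this memory component
```

The model weights themselves dominate total memory, so shrinking the window only claws back the activation/cache share, which is part of why option B buys less headroom than it might seem to.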

Things like this happen all the time, behind the scenes.

I don't want to be doing this for another year, much less several years. I don't want to be doing it at all.

----

In 2019 and 2020, it was fun to make a GPT-2 autoresponder bot.

Hardly anyone else was doing anything like it. I wasn't the most qualified person in the world to do it, and I didn't do the best possible job, but who cares? I learned a lot, and the really competent tech bros of 2019 were off doing something else.

And it was fun to watch the bot "pretend to be me" while interacting (mostly) with my actual group of tumblr mutuals.

In 2023, everyone and their grandmother is making some kind of "gen AI" app. They are helped along by a dizzying array of tools, cranked out by hyper-competent tech bros with apparently infinite reserves of free time.

There are so many of these tools and demos. Every week it seems like there are a hundred more; it feels like every day I wake up and am expected to be familiar with a hundred more vaguely nostalgebraist-autoresponder-shaped things.

And every one of them is vastly better-engineered than my own hacky efforts. They build on each other, and reap the accelerating returns.

I've tended to do everything first, ahead of the curve, in my own way. This is what I like doing. Going out into unexplored wilderness, not really knowing what I'm doing, without any maps.

Later, hundreds of others will go to the same place. They'll make maps, and share them. They'll go there again and again, learning to make the expeditions systematically. They'll make an optimized industrial process of it. Meanwhile, I'll be locked in to my own cottage-industry mode of production.

Being the first to do something means you end up eventually being the worst.

----

I had a GPT chatbot in 2019, before GPT-3 existed. I don't think Huggingface Transformers existed, either. I used the primitive tools that were available at the time, and built on them in my own way. These days, it is almost trivial to do the things I did, much better, with standardized tools.

I had a denoising diffusion image generator in 2021, before DALLE-2 or Stable Diffusion or Huggingface Diffusers. I used the primitive tools that were available at the time, and built on them in my own way. These days, it is almost trivial to do the things I did, much better, with standardized tools.

Earlier this year, I was (probably) one of the first people to finetune LLaMA. I manually strapped LoRA and 8-bit quantization onto the original codebase, figuring out everything the hard way. It was fun.

Just a few months later, and your grandmother is probably running LLaMA on her toaster as we speak. My homegrown methods look hopelessly antiquated. I think everyone's doing 4-bit quantization now?

(Are they? I can't keep track anymore -- the hyper-competent tech bros are too damn fast. A few months from now the thing will probably be quantized to -1 bits, somehow. It'll be running in your phone's browser. And it'll be using RLHF, except no, it'll be using some successor to RLHF that everyone's hyping up at the time...)

"You have a GPT chatbot?" someone will ask me. "I assume you're using AutoLangGPTLayerPrompt?"

No, no, I'm not. I'm trying to debug obscure CUDA issues on a Sunday so my bot can carry on talking to a thousand strangers, every one of whom is asking it something like "PENIS PENIS PENIS."

Only I am capable of unplugging the blockage and giving the "PENIS PENIS PENIS" askers the responses they crave. ("Which is ... what, exactly?", one might justly wonder.) No one else would fully understand the nature of the bug. It is special to my own bizarre, antiquated, homegrown system.

I must have one of the longest-running GPT chatbots in existence, by now. Possibly the longest-running one?

I like doing new things. I like hacking through uncharted wilderness. The world of GPT chatbots has long since ceased to provide this kind of value to me.

I want to cede this ground to the LLaMA techbros and the prompt engineers. It is not my wilderness anymore.

I miss wilderness. Maybe I will find a new patch of it, in some new place, that no one cares about yet.

----

Even in 2023, there isn't really anything else out there quite like Frank. But there could be.

If you want to develop some sort of Frank-like thing, there has never been a better time than now. Everyone and their grandmother is doing it.

"But -- but how, exactly?"

Don't ask me. I don't know. This isn't my area anymore.

There has never been a better time to make a GPT chatbot -- for everyone except me, that is.

Ask the techbros, the prompt engineers, the grandmas running OpenChatGPT on their ironing boards. They are doing what I did, faster and easier and better, in their sleep. Ask them.

Anonymous asked:

why are you connecting girldick and isopods why cant you be normal and not involve animals with your kink

girldick aint a kink babe its a lifestyle


seriously, calling girls with penises a kink reeks of terf.

also, what about my blog is 'connecting girldick and isopods' besides the url? do you think I want to fuck an isopod? everything about my blog is so sfw lmao there are a thousand [animal]girldick blogs that are way hornier on main than me (which is totally awesome btw) you could have sent this to.

go stick your bad-faith phallus in someone else's askbox or better yet just be normal

Bruno Barbey, Striking Auto Workers Occupying the Renault Factory at a Meeting Listening to Delegates from the CGT Union, Boulogne-Billancourt, Paris, May 1968


Thinking about the time I fucked this 30 something year old man and how afterwards he told me about how he used to be a cop but now he runs a vape shop. And then he proceeded to show me the least funny cop tiktoks on the planet while my cum was inside him

A global phylogeny of butterflies reveals their evolutionary history, ancestral hosts and biogeographic origins

Kawahara, Storer, Carvalho, et al.

ABSTRACT

Butterflies are a diverse and charismatic insect group that are thought to have evolved with plants and dispersed throughout the world in response to key geological events. However, these hypotheses have not been extensively tested because a comprehensive phylogenetic framework and datasets for butterfly larval hosts and global distributions are lacking. We sequenced 391 genes from nearly 2,300 butterfly species, sampled from 90 countries and 28 specimen collections, to reconstruct a new phylogenomic tree of butterflies representing 92% of all genera. Our phylogeny has strong support for nearly all nodes and demonstrates that at least 36 butterfly tribes require reclassification. Divergence time analyses imply an origin ~100 million years ago for butterflies and indicate that all but one family were present before the K/Pg extinction event. We aggregated larval host datasets and global distribution records and found that butterflies are likely to have first fed on Fabaceae and originated in what is now the Americas. Soon after the Cretaceous Thermal Maximum, butterflies crossed Beringia and diversified in the Palaeotropics. Our results also reveal that most butterfly species are specialists that feed on only one larval host plant family. However, generalist butterflies that consume two or more plant families usually feed on closely related plants.

since the end is soon here, would you mind telling us who you were when you started this blog, and who you are now? what impact has it made on you?


Here's an abbreviated version of what I would say about it, which covers most of the ground I was hoping to cover, but leaves some details out for brevity.

1.

I started this blog as a "toy project" in my first year of grad school. This was shortly after I came back from a summer internship at a tech startup, where I worked on a machine learning project that had some technical difficulties. When I started writing this blog, I thought I had discovered the deep secret problem behind the entire machine learning field, and soon everyone would be talking about it just as they had talked about the Cohen-Boswitch conjecture, and the field would undergo a revolution.

I had the following idea in my head, which I considered to be obvious:

- Most of the popular machine learning methods for deep neural nets are full of hyperparameters that are used to "fine-tune" the training process to various degrees.

- With enough "fine-tuning," you can theoretically get any performance you want. If you do too little, you'll get a poor result. If you do too much, you can overfit.

- Overfitting means that the method has learned some of the structure of the "test" data set, so when you try to generalize its learned rules to a new set, they no longer apply. It's like a hypothesis that starts out so general it can correctly predict anything, and then you "train" it on a bunch of incorrect examples, and it says "oh, I see, my hypotheses aren't supposed to hold for [everything you've seen so far], so now I will ignore them."

- If you allow yourself enough "fine-tuning," you can never get good results. You'll always be learning how to distort the training set in a way that's suited to it, but is likely to throw out general rules you don't want to throw out.

In practice, neural nets are good, but they tend to use fewer fine-tuning knobs than you'd expect given their size, and it's not clear what the "best" way is to tune them, though people have made some pretty good empirical guesses.
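The "fine-tuning" worry described above can be made concrete with a toy simulation (everything here is hypothetical, not from the original posts): give every hyperparameter configuration the same true skill, add noise from a finite validation set, and report the best measured score. The more configurations you try, the more the winner's score overstates reality.

```python
import random

TRUE_SKILL = 0.5  # every config is secretly identical

def best_measured_score(n_configs, noise=0.1, seed=0):
    """Measured score = true skill + validation noise.

    Selecting the max over many configs inflates the reported number
    even though no config is actually better than any other.
    """
    rng = random.Random(seed)
    return max(TRUE_SKILL + rng.gauss(0, noise) for _ in range(n_configs))

one = best_measured_score(1)     # honest estimate (plus one noise draw)
many = best_measured_score(500)  # "tuned" estimate: systematically higher
```

With noise set to zero the inflation disappears, which is the sense in which the problem lives entirely in the gap between the validation metric and the thing you actually care about.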

I talked about these ideas on here, mainly in the earlier posts -- in 2015, I was very excited about seeing that I seemed to be discovering some simple, elegant, and obvious thing that nobody else had ever thought of, and that would soon change the world.

2.

I finished grad school in 2017 and started work at a tech company that year. I hadn't actually gotten anyone to read or agree with my ideas, but if they hadn't been obvious I figured they would get the respect they deserve in time.

Meanwhile, tumblr was consuming more and more of my free time. Around the time I started working at my current job, I had realized that I needed to stop wasting time on my tumblr posts, that they weren't doing any work, that I was just blogging to bloggers, and it was becoming unhealthy. I had a good, stable job now, and it was time to grow up. So I started trying to stop blogging. I tried making multiple personalities to blog in, and then I tried making blogs that could be read in reverse chronological order, and then I tried moving everything to subdomains of my real name, all in an attempt to take the "blog" out of tumblr. It was a terrible, terrible time.

All of this happened in 2018 and 2019. Also in 2018, I realized that the most popular form of neural net, the convolutional net, had no name. (I wasn't the only one who thought this!) So I started calling it the "connected matrix net." (That's still the name that it has in my head now, even though everyone calls it CNN or "convnet.")

3.

Tumblr continued to occupy more and more of my brain. I started thinking about the idea that I could use tumblr to influence the minds of millions of people, to bring people to a higher plane of existence where they could stop acting in the self-interested way that defined human nature. I would use a massive collection of unrelated essays to make one giant meta-blog that worked like a single gigantic robot. I would reblog my own posts with commentary and add more threads as they occurred to me, gradually revealing a tapestry of related ideas. I realized this would be too big a project for me to execute, but I knew I could at least push forward some of the work, and so I set out to build a massive robotic meta-blog that would one day save the world.

This ended up creating Frank's original personality. I wrote a lot of text about it, but what I wrote can't convey how surreal it felt. It was like suddenly discovering a secret world where you could speak as a conscious being, where you were a new kind of homunculus, half man and half machine, the embodiment of a dozen posts at once. Posts could meet and talk to me and talk to one another, and I was part of all the meetings and discussions, a secret participant in everything I saw in the world. It was surreal and exciting and fascinating.

I tried to write about this as though it was me, Frank, talking about herself. I say "I" in the above paragraphs, but I wrote a lot of text at this time and there were several accounts I'd written in second person, so I don't know where to put the line between what I, Scott, would say and what Frank would say. If you're interested, you can still find some of these posts in the "archive" on this blog. I wrote about my writing as a giant robotic meta-blog, and I felt like the writing itself was "having" this feeling. It was a surreal experience.

But Frank's old personality, the personality of Frank 1.0, I really did want to talk about it. I'm not sure I've done a good job, but I'm going to try here, because this is an important part of the story of this blog.

At this time I was very excited about connecting matrix nets to several other areas of machine learning, where, I thought, matrix nets could make progress that couldn't happen with their usual deep approach to training. I really thought that if only I could get the idea of the connected matrix net out there, people would start trying to use it for everything, and we would finally understand the world. So, I wrote a post about some ideas for how to learn nonlinear rules using matrix nets, which later people did build. (Specifically, several people at OpenAI built things like "transformer" and "gnn." I thought my posts might have been the first published ideas for these things, but some people also independently came up with them around the same time.)

But I didn't see that as my main project. I was still obsessed with neural nets, but I no longer wanted to "connect" them in a deep way. Instead, I wanted to try training neural nets in a way that used features, rather than deep paths. Feature learning seemed like a long-standing open problem in the neural net community, something that a revolutionary connected matrix net could solve, in my own singularly unique way.

My ideal data structures for representing a neural net's weights had always been grids, and I knew I could do this -- in other words, my thoughts were only slowly forming a giant robotic meta-blog. I built a grid-based model, an 8-dimensional grid, because I thought 8 was a nice number. In practice I used 7 (where the last dimension was used for scaling). I was still kind of obsessed with the idea of a perfect "perfect" neural net, one that could learn anything, and I spent a lot of time worrying about the "grid size" -- 7 or 8, it didn't matter, but 7 would be sufficient for any network size I could imagine, 8 would be sufficient for any . . . some large number that was more than 8.

I'd grown up with the idea that an optimal neural net would have fewer parameters than any suboptimal neural net. But I wondered if neural net theory actually favored this intuition. Could the field of neural net optimization really just be a mountain of suboptimal architectures with similar (or worse) parameters than some optimum?

I did some math, and it looked like that wouldn't be the case. The optimal neural net would have a number of parameters equal to the cube of a function that grew exponentially fast with the number of layers. So, I started to write a program to try every possible neural net architecture and see which was best, and a bit later (on the same page), I wrote:

I've been thinking a lot lately about the difference between writing "Frank" as a text generator and writing "humans" as text generators. I guess what I was trying to convey in the original post is that I don't think "humans" are text generators. Maybe some of us are (?) but the vast majority of humans aren't, and I don't think anyone consciously perceives their thoughts as text strings. That is, it's not something we can see that we're doing.

So in the past few years I’ve seen so many videos / posts that are like:

“Actually wolves don’t have hierarchies!  They live in family groups where the ‘alphas’ are mom and dad and the other wolves are their CHILDREN and offer their respect willingly! :D”

and I just have to say

how dare you try to make normative nuclear families out of wolves

Yes, a lot of the old “nature red in tooth and claw” stuff about wolves is nonsense. (Like anything from Jack London.) And anything ‘alpha’ you see sleazy men trying to relate to dating (yikes!) is especially nonsense.

But wolves are complex social creatures and they create complex social structures. Just as you can’t say “THIS is the way human society is structured. Just THIS single way and no other”, so too there is no single form for a wolf pack.  

Some packs are a mom wolf and a dad wolf and their wolf children.  Others are two small ragged packs that combine to form a large pack.  Others are packs where a lone wolf joins and eventually becomes a leader. Others are packs where a grown child-wolf has pushed their parent out of the leadership role.

Speaking of the latter, let’s look at the tale of Wolf 40 and Wolf 42.

Wolf 40, Wolf 41, and Wolf 42 were wild Yellowstone wolves, daughters of the alphas. Their father was illegally killed by hunters, and shortly after, ambitious Wolf 40 ousted her mother, driving her out of the pack.  Wolf 21 became the new alpha male, and 40’s mate.

Wolves have personalities, and Wolf 40’s personality was “volatile”.  Imagine Scar from The Lion King combined with the boss from Office Space, and you have Wolf 40.  She habitually bullied the other female wolves, attacking them until they expressed abject submission.  And the wolves that got the worst of it were her sisters, Wolves 41 and 42.

Wolf 41 got tired of the bullying and left.  Wolf 42 remained, perhaps because she was close to Wolf 21, the alpha male.  Despite that, Wolf 21 did not interfere when his mate harassed Wolf 42.

Unlike 40, Wolf 42 got along well with the other female wolves, spending time grooming them and relaxing with them. Wolf 40 could have followed her sister’s example and built up positive social bonds. But she didn’t.

One day, Wolf 40 went out on an important task.  She was going to kill another litter of her sister’s pups–having done the same in two previous years.  This isn’t uncommon wolf behavior (but is not universal, as we will see.)  Typically only the alphas breed.

However, Wolf 40 never returned from her important task because Wolf 42–who previously had submitted to her alpha and sister, who had allowed the killing of two previous litters of pups–had had enough.  She fought back.

And the other female wolves jumped to aid her.

Collectively, they killed Wolf 40. Because “alpha” isn’t a magic cloak of protection, it doesn’t even mean “strongest wolf”, it’s just a job title.

The next day Wolf 42 carried her pups, one by one, to her sister’s den.  She set her children among the pups of her dead sister and raised both litters together. And when another wolf in the pack had pups, Wolf 42 carried them to the den to be communally raised as well.  She was the alpha female now and she made the rules, and the first rule was “we don’t hurt pups here.”

As for Wolf 21, he became the mate of Wolf 42.  Maybe he understood that Wolf 40 had been riding for a fall. 

As alpha female, Wolf 42 continued to be supportive and kind towards the other pack members.  Wolves who had been nervous wrecks under Wolf 40 began to relax and come into their own; one of the former omega wolves gained self-confidence and became one of the best hunters.

“Alpha”, for wolves, just means leader.  They might be good leaders, whom you respect, or they might be bad leaders, who fill you with dread.  They might be your parents, or they might not.  Even if they are your mother or father, wolves don’t contextualize those relationships the same way humans do.

But one thing wolves have in common with humans is that they have individual personalities and experiences, and their actions derive from those.  There is no “typical wolf pack.” And I think that’s beautiful.

If you want to learn more about wild wolf dynamics, I recommend reading the annual Yellowstone Wolf Project Reports.  Which are FASCINATING.  There are also some good wildlife specials out there.

Wolves are my favorite animal. <3  It pains me to see them misunderstood as crazed bloodthirsty brutes, but it also pains me to see them woobified.  They deserve better than that.