LongWriter – Increase llama3.1 output to 10k words

(github.com)

154 points | by taikon 38 days ago

10 comments

  • vessenes 38 days ago
    The sample output is interesting: it has highly suggestive chapter titles that read like pretty normal story beats. It seems to be guiding itself with these, then chunking out longer-form writing per chapter.

    For what it's worth, the writing is... bland, in the way that only an LLM's writing can be: relatively grammatically sound and totally soulless. I will never think of the love story of Elizabeth and Thomas again, despite having read the entire thing.

    In the early days of GPT-3, I experimented a lot with getting it to respond as certain authors, and it was really quite excellent at that. This is one of the many things that seem likely to have been nerfed over time, I'd guess partly because human preference training just asks for bland responses, and partly because the injected prompts from OpenAI strongly discourage doing things related to real people, and those preferences are carried through, subtly or not, into the augmented training data most open models tune on.

    • MintsJohn 37 days ago
      Interesting notion. I notice the same with image models: less style, more blandness in the latest generation. Only MJ seems to have style as a feature.
    • elfelf12 38 days ago
      Is it a copyright problem or a capitalism problem, or why else do we only get nerfed, dumb chatbots?

      It would be interesting to really try hard and create an LLM that can write novels in the style of an author. And skip the chat functionality!

      • zobzu 38 days ago
        I believe it's neither. I believe it's purely a form of control: not to make more money later or lose less money, but because many are very afraid of how people would use an un-nerfed LLM.

        However, it's inevitable.

      • sReinwald 38 days ago
        Perhaps both. But I wonder if the incredible blandness of most chatbots is effectively just a regression towards the mean.

        Most AI companies try to train their bots on vast amounts of different data, and I suspect it's very difficult for that to result in very creative writing when you're training on works of fiction, as well as cooking recipes, Reddit comments and technical documentation.

      • roenxi 37 days ago
        Copying writers is probably a copyright thing. But the experience with generative AI for images was that, at least for the early models, it was good to put things like "masterpiece, highest quality" in the prompt. The model biased towards average rather than trying to maintain a high standard. The more general problem here could easily be that people haven't figured out how to prompt interesting writing from an LLM yet.

        Although my personal theory would be that LLMs just write the way someone without an ego or firsthand knowledge would write: they have a bunch of different angles they could take, but no particular reference to draw on to determine which is true. Great human writers are often cataloguing their extra-literary experiences. How is ChatGPT supposed to be inspired by a beautiful sunset to capture it in a way that has never been done before? It is capable of the writing part, but the inspiration part is a lot harder for it.

        • theturtle32 37 days ago
          The notion that generating something "in the style of" a human creator is a violation of copyright is categorically false. Copyright is only infringed when a work is substantially copied. Generating something new but with a similar feel is fair game. That OUGHT to be (but, obnoxiously, seems not to be) universally uncontroversial.

          Human creators might bristle and find it distasteful to have works automatically generated in a style they spent a long time honing, but it is most certainly not a violation of copyright.

        • 6510 37 days ago
          There is nothing between the lines.
      • numpad0 37 days ago
        An LLM can continue anything; chat is simply what has worked best so far. Outputs being bland and soulless, and lacking in global structure if I may add, is just architectural. There's nothing behind that.
  • mmaunder 38 days ago
    What's the difference between this and using chat history to concatenate outputs, prompting repeatedly with something like “Now write the next section”? I've done that with NotebookLM and it'll write a complete fictional story based on sources, for example.
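
    Roughly this kind of loop, if anyone wants to try it (an untested sketch; the model name and prompts are placeholders for whatever OpenAI-compatible endpoint you use):

        # Naive "write the next section" loop: feed the growing chat history
        # back in and keep asking for another section. Model name and prompts
        # are placeholders, not anything specific to this project.
        from openai import OpenAI

        client = OpenAI()
        messages = [{"role": "user", "content": "Write the first section of a novel about X."}]
        story = []

        for _ in range(10):
            reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
            section = reply.choices[0].message.content
            story.append(section)
            messages.append({"role": "assistant", "content": section})
            messages.append({"role": "user", "content": "Now write the next section."})

        print("\n\n".join(story))
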
    • dotnet00 38 days ago
      In my testing, that often causes the model to 'drift' and ramble wildly compared to just getting one long output from the very start.

      The issue is probably that when you split it by just asking for the next section, you're asking it to figure out how to continue from a block that wasn't written with the awareness that it'd have to add on to it.

      From the diagram in the repo, I guess this first plans out the structure for each block, then generates the blocks based on the plan.
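
      Something like this, if I'm reading the diagram right (a loose sketch, not LongWriter's actual code; `generate` stands in for whatever single model call you use):

          # Plan-then-write sketch (not LongWriter's actual code).
          # `generate` is any function that takes a prompt string and returns text.
          def plan_then_write(instruction: str, generate) -> str:
              # Step 1: ask for an outline, one numbered section per line.
              plan = generate(
                  "Break this writing task into numbered sections with word counts:\n"
                  + instruction
              )
              sections = [line for line in plan.splitlines() if line.strip()]

              # Step 2: write each section with the full plan and the text so far
              # in context, so each chunk knows it is part of a larger whole.
              written = []
              for item in sections:
                  text = generate(
                      f"Task: {instruction}\nPlan:\n{plan}\n"
                      f"Already written:\n{''.join(written)}\n"
                      f"Now write only this section: {item}"
                  )
                  written.append(text + "\n\n")
              return "".join(written)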

    • LeoPanthera 38 days ago
      Most LLMs are trained to write "complete" outputs. So each section will end up being like a tiny self-contained short book. Without manual editing they will not create long narratives.
      • b33j0r 38 days ago
        The notion that an LLM will be used or can be used mostly for a one-shot request/response has been one of the most idiosyncratic things about the first and second waves of this tech.

        Like, not only can it not “make a complete app in 45 seconds,” I almost never even want that.

    • ben_w 37 days ago
      My experience is that by default it's like that game where you fold some paper, draw just up to the fold, pass the paper to the next player and they continue, and then you unfold it and see what came out: https://i.kym-cdn.com/photos/images/facebook/001/782/166/960...

      With editing or lots of regeneration it can still work, of course.

      (That said, I last tried this with the original ChatGPT…)

    • thomasahle 38 days ago
      It would be the same if the model were "raw", trained only on text completion. But all models these days are RLHF'ed on (prompt, answer) pairs, so unfortunately they can get confused if the prompt already contains part of an answer.
      • elfelf12 38 days ago
        I think base models are far superior to those boring instruct-tuned models. I would rather have a good text completionist than a chatbot. But as far as I know I am in a minority there.
  • ed 38 days ago
    Paper: https://arxiv.org/abs/2408.07055

    The model is stock Llama, fine-tuned on a set of long documents to encourage longer outputs.

    Most of the action seems to happen in an agent.

  • danng87 38 days ago
    Interesting project!

    Does anyone know how LongWriter handles maintaining coherence and structure in longer outputs? Also, are there specific strategies or parameters recommended for fine-tuning LLaMA 3.1 with this setup to maximize the quality of generated text?

    • yawnxyz 38 days ago
      How do people eval these very long outputs?

      I've never figured that out (and no I can't just... read all of them)

      • Multicomp 38 days ago
        I don't know how to answer your question. But I will say that I can see a future where one has a brainstormed setting / plot outline / concept, has the LLM output a first draft of whatever length, and then makes changes / tweaks to the story / copy over time.

        The hardest part of writing for me is the first draft. Editing an existing copy to my own human artistic vision is much easier. No, this character doesn't act like this, he acts like that.

        Presuming you don't have an allergic reaction to AI-affected writing copy (even though the publishing houses are going to outsource their copyedits and style-guide edits to LLMs, that is not hard to predict), an author could have the copy start out soulless and then hand-edit from there until they like it.

        Then the copy moves into a hybrid world where AI was used as a power tool, not the entire product. Copyright law may frustrate that for a time, say if a final copy that is over 5% AI-generated is ineligible for copyright protection, but otherwise there will be stories and the best stories will win.

        1. Hand-crafted with a fountain pen through all the edits, digitized to an opendoc (ok who are we kidding, .docx, but I can dream of open file formats)

        2. This story started and stayed digital-native in Scrivener / yWriter and was eventually dumped to a .docx

        3. This story started as an LLM chat response and was edited heavily to match the artist's human vision

        All 3 stories will exist. And there will be a sea of slop that used (3) and then barely edited a thing, hoping to sell a book through SEO tag manipulation and an 'eye-catching'/lurid cover, just as there is now with (2) and its hastily thrown-together rip-offs of others' text.

        But you can believe that I will be glad to go all Star Trek Holodeck on my idea concepts for books and tabletop campaigns.

        Computer, give me a questline for a faction called the Silver Carders, there's a catfolk named Marvin who is the adopted son of a human named Doug Alvaro and he is the old flame of the founder of the faction and there's political intrigue that X Y and Z, please find a good mix-in for these 4-7 TV tropes links I like to play with, go.

        Ok now swap out the absentminded professor gadgeteer with a cloud cuckoolander grandma mechanic.

        Ok now find me a few entrypoints to this faction for my party characters who are currently A, B, and C.

        Oh yeah, the max context this stuff will be useful for will be great.

        Can I do that now with manual digital tools? Of course. But this lessens the activation energy/boilerplate of typing this stuff up a lot.

        Will it, long term, make future generations unable to cope without the tool? Yes, just like I cannot use a slide rule or do any geometry outside of my classes; I have computer tools for that. After 20 years, LLMs will be a tool that is normalized enough.

        Granted, it will be odd when 3-book series come out covering a recent current event that captures the public's imagination within weeks of the event, instead of the 3-years-later turnaround that entertainment media like books and movies usually take today.

        Or odd when people can pay to have their own version of the story made, either inserting characters or 'what if'ing the story, altering a single plot point to see how the characters react and how that modifies the overall story.

        We will all be more literarily conversant whether we want to or not, and I'm not sure whether I like that or I'm annoyed by it yet. Too soon to tell.

        • yawnxyz 38 days ago
          I think some abstraction will need to occur, or it's just too much information for us to ever take in and hold all at once... I think this goes past my problem of "I can't eval long outputs" and your pick-and-edit approach. Code assistants are in the same boat right now too.

          It looks like all these knowledge fields are converging on the same problem.

  • 8bitsrule 38 days ago
    Obviously there needs to be some oversight that limits the distribution of machine-written articles 'created' by people who aren't competent enough in the subject area to know whether the content is trustworthy. Otherwise a lot of damage can be done in a very short time. (Pandora's box and all that.)
  • hshshshsvsv 37 days ago
    Will this ever be able to generate anything meaningful or useful?
    • Netcob 37 days ago
      We'll probably see a lot of this:

      Alice writes short prompt -> Alice's LLM generates oversized slop -> Bob's LLM generates summary -> Bob skims the summary.

      Some information may be lost in this process due to hallucinations and errors, but Nvidia and OpenAI will make a lot of money causing and fixing problems.

  • wkat4242 38 days ago
    Really interesting. I wonder if you can do this on ollama too.
    • underlines 37 days ago
      vLLM is simple to set up: use Docker and make sure your backend (Ubuntu, WSL Ubuntu, or whatever) has GPU support installed.
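
      Once it's installed, something like this should work (a sketch; the model id is my assumption of the published LongWriter checkpoint, so swap in whatever weights you actually pull):

          # Sketch: loading the LongWriter model with vLLM's Python API.
          # The model id below is an assumption; point it at your local checkpoint if different.
          from vllm import LLM, SamplingParams

          llm = LLM(model="THUDM/LongWriter-llama3.1-8b", trust_remote_code=True)
          params = SamplingParams(temperature=0.7, max_tokens=8192)

          outputs = llm.generate(["Write a 5000-word story about a lighthouse keeper."], params)
          print(outputs[0].outputs[0].text)
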
  • jw12 37 days ago
    Looks super interesting!
  • fitsumbelay 38 days ago
    this looks cool. any plans to support 3.2?
  • alwinaugustin 38 days ago
    How do you use this with a local ollama setup?