Subtopic: AI

Company and industry news, featured projects, open source code, tech tips, and more.

Better search results with character n-grams

Michael Argentini · Wednesday, February 11, 2026

Vector search has been around for a long time; its underpinnings go back to the vector space model of information retrieval from the 1970s, and large search providers have relied on vector-based techniques for decades. This powerful, almost magical technology serves as a core component of most web and app services today, including modern AI-powered search using retrieval-augmented generation (RAG). It provides a computationally inexpensive way to leverage AI's natural language processing to find data with semantic context.

What is a vector database?

A vector database stores data as numerical embeddings that capture the semantic meaning of text, images, or other content, and it retrieves results by finding the closest vectors using similarity search rather than exact matches. Sounds like an AI large language model (LLM), right? This makes it great for web and app content because users can search by meaning and intent, not just keywords, so synonyms and loosely related concepts still match. It also scales efficiently, which is one reason large organizations like Google have used it.
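To make "closest vectors" concrete, here's a quick Python sketch (illustrative only; the production samples later in this post are C#) using tiny made-up 3-dimensional embeddings. Real embeddings have hundreds or thousands of dimensions, but the ranking idea is the same:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings (made-up values for illustration)
query   = [0.9, 0.1, 0.2]   # "spicy condiment"
sauce   = [0.8, 0.2, 0.1]   # "sriracha sauce"
bicycle = [0.1, 0.9, 0.3]   # "mountain bike"

# The semantically related entry ranks higher than the unrelated one
print(cosine_similarity(query, sauce) > cosine_similarity(query, bicycle))  # True
```

A vector database does essentially this comparison, but at scale, using indexes that avoid scanning every stored vector.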

A happy side effect of how these platforms work is that they're also good at handling misspellings, to a point. To get truly robust handling of spelling variations, however, two strategies are common:

  1. Spell-correct the actual search text before using it

  2. Include character n-grams in your vector database entries

What are character n-grams?

Character n-grams break text into overlapping sequences of characters, which can then be hashed into vector embeddings and stored alongside the commonly used semantic embeddings, allowing vector search systems to better match terms despite typos, inflections, or spelling variations. Without these n-grams, a misspelled query like "saracha sauce" would likely return a higher score for "hot sauce" entries. But by including character n-grams, a combined (fused) search would more consistently return a higher score for items with the correct spelling "sriracha sauce".

Using these n-grams can better handle searches with:

  • typos

  • missing letters

  • swapped letters

  • phonetic-ish variants

  • common misspellings

How does this work? At a high level, it adds a character match capability to the standard semantic search used by most vector database implementations. Here's a quick example of what happens under the hood. Take the first word in our previous example:

sriracha

  • 3-grams: sri, rir, ira, rac, ach, cha

  • 4-grams: srir, rira, irac, rach, acha

saracha

  • 3-grams: sar, ara, rac, ach, cha

  • 4-grams: sara, arac, rach, acha

Shared grams:

  • shared 3-grams: rac, ach, cha

  • shared 4-grams: rach, acha

So even though the beginning is wrong (sri vs sar), the ending chunks that carry a lot of the distinctive shape of "sriracha" survive (rach, acha, cha). And since the second word is the same, they have even more matching grams.

When these matches are fused with semantic matches, it adds weight to the correctly spelled "sriracha sauce" entry, yielding a better match set.
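The gram lists above are easy to verify. Here's a quick Python sketch (illustrative only; the production C# implementation appears later in this post) that extracts character n-grams and computes the overlap:

```python
def char_ngrams(text: str, n: int) -> set[str]:
    """All overlapping character n-grams of the given size."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

correct = "sriracha"
typo = "saracha"

shared3 = char_ngrams(correct, 3) & char_ngrams(typo, 3)
shared4 = char_ngrams(correct, 4) & char_ngrams(typo, 4)

print(sorted(shared3))  # ['ach', 'cha', 'rac']
print(sorted(shared4))  # ['acha', 'rach']
```

A production implementation hashes each gram into a fixed-length vector (as the C# helper below does) rather than comparing sets directly, but the overlap is what drives the match.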

How to use character n-grams

When it comes to including character n-grams, there are only a couple of changes you need to make to a standard semantic vector database implementation:

  1. When you generate embeddings, you also need to generate character n-gram embeddings; this is true both when you store data in the database, and when you search.

  2. When searching, you need to execute a search both on the semantic vectors and the n-gram vectors, then fuse the results using Reciprocal Rank Fusion (RRF), which is a great way to merge disparate result sets and combine the scores.
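The RRF idea itself fits in a few lines: each result list contributes 1 / (k + rank) per item, and the summed scores decide the fused order. Here's a quick Python sketch with made-up result lists (the full C# version follows later in this post):

```python
def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists by summing reciprocal-rank scores (ranks are 1-based)."""
    scores: dict[str, float] = {}
    for ranked in ranked_lists:
        for rank, item in enumerate(ranked, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)

# Hypothetical top results from each retriever for the query "saracha sauce"
semantic = ["hot sauce", "sriracha sauce", "chili oil"]
chargram = ["sriracha sauce", "sriracha mayo", "hot sauce"]

# "sriracha sauce" ranks near the top of both lists, so it wins the fusion
print(rrf_fuse([semantic, chargram])[0])  # sriracha sauce
```

Because RRF only uses rank positions, it sidesteps the fact that raw semantic scores and raw n-gram scores live on different scales.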

The following samples fill those gaps. They are written in C# for .NET, which is part of a common stack we use to build cross-platform, secure, high-performance web and mobile apps and services for our clients. We also tend to prefer the vector database Qdrant for its performance, maintainability, and open source model, so it is referenced in the samples as well.

References to AiService.GenerateEmbeddingsAsync() are not covered here. Essentially, it's a method that generates standard semantic embeddings; replace it with your own (likely existing) method. And references to QdrantService.Client are merely references to a standard Qdrant client provided by the Qdrant NuGet package.

Note: Some of the code was generated by AI, but was reviewed and refactored by an actual human developer (me!).

Character n-gram helper

First, you need a way to create n-grams. The CharNGramEmbedding class below will fill that gap. It allows you to generate character n-grams for a given string, and it also provides a method for fusing the semantic and n-gram search results into a single, weighted result set.

using System.Globalization;
using System.Text;

namespace MyApp.Extensions;

/// <summary>
/// Generates a typo-robust, fixed-length dense vector representation of text
/// using hashed character n-grams.
/// </summary>
public static class CharNGramEmbedding
{
    /// <summary>
    /// Generates a normalized dense embedding vector for the specified text
    /// using hashed character n-grams.
    /// </summary>
    /// <param name="text">
    /// The input text to embed.
    /// </param>
    /// <param name="dims">
    /// The dimensionality of the output vector. Higher values reduce hash
    /// collisions at the cost of additional memory and storage.
    /// A value of 256 is a good default for typo-robust search.
    /// </param>
    /// <param name="minGram">
    /// The minimum character n-gram size to generate.
    /// Smaller values increase recall but may introduce noise.
    /// </param>
    /// <param name="maxGram">
    /// The maximum character n-gram size to generate.
    /// Larger values emphasize longer, more specific substrings.
    /// </param>
    public static float[] Embed(string text, int dims = 256, int minGram = 3, int maxGram = 4)
    {
        ArgumentOutOfRangeException.ThrowIfNegativeOrZero(dims);

        var v = new float[dims];
        var normalized = Normalize(text);

        if (normalized.Length == 0)
            return v;

        // Add boundary markers so "sriracha" and "sriracha sauce"
        // still share useful grams
        var s = $"^{normalized}$";

        for (var n = minGram; n <= maxGram; n++)
        {
            if (s.Length < n)
                continue;

            for (var i = 0; i <= s.Length - n; i++)
            {
                var gram = s.AsSpan(i, n);

                // Hash n-gram → index
                var h = Fnv1A32(gram);
                var idx = (int)(h % (uint)dims);

                // Optional sign-hash reduces collision bias
                var sign = ((h & 1u) == 0u) ? 1f : -1f;

                v[idx] += sign;
            }
        }

        // L2 normalize for cosine similarity
        // (or dot product on normalized vectors)
        L2NormalizeInPlace(v);

        return v;

        static string Normalize(string input)
        {
            if (string.IsNullOrWhiteSpace(input))
                return string.Empty;

            // lowercase + strip accents + keep letters/digits/spaces
            var lower = input.ToLowerInvariant().Normalize(NormalizationForm.FormD);
            var sb = new StringBuilder(lower.Length);

            foreach (var ch in lower)
            {
                var uc = CharUnicodeInfo.GetUnicodeCategory(ch);
            
                if (uc == UnicodeCategory.NonSpacingMark)
                    continue;

                // ignore punctuation
                if (char.IsLetterOrDigit(ch))
                    sb.Append(ch);
                else if (char.IsWhiteSpace(ch) || ch == '-' || ch == '_')
                    sb.Append(' ');
            }

            // collapse spaces
            return string.Join(' ', sb.ToString().Split(' ', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries));
        }

        static uint Fnv1A32(ReadOnlySpan<char> s)
        {
            const uint offset = 2166136261;
            const uint prime = 16777619;

            var hash = offset;
            
            for (var i = 0; i < s.Length; i++)
            {
                // hash UTF-16 chars (fine for this purpose)
                hash ^= s[i];
                hash *= prime;
            }

            return hash;
        }

        static void L2NormalizeInPlace(float[] v)
        {
            double sumSq = 0;

            for (var i = 0; i < v.Length; i++)
                sumSq += (double)v[i] * v[i];
            
            if (sumSq <= 0)
                return;

            var inv = (float)(1.0 / Math.Sqrt(sumSq));
            
            for (var i = 0; i < v.Length; i++)
                v[i] *= inv;
        }
    }
    
    /// <summary>
    /// Fuses multiple ranked result lists using <b>Reciprocal Rank Fusion (RRF)</b>.
    /// RRF is robust when combining heterogeneous retrieval signals (e.g. semantic
    /// embeddings and character n-gram embeddings) whose raw scores are not directly
    /// comparable.
    /// </summary>
    /// <param name="a">
    /// The first ranked result list (e.g. results from a semantic embedding search),
    /// ordered from best to worst. The list should already be truncated to a reasonable
    /// top-K size.
    /// </param>
    /// <param name="b">
    /// The second ranked result list (e.g. results from a character n-gram or typo-robust
    /// search), ordered from best to worst. The list should already be truncated to a
    /// reasonable top-K size.
    /// </param>
    /// <param name="getId">
    /// A function that extracts a stable, unique identifier from a result item.
    /// This identifier is used to merge and score items that appear in multiple lists.
    /// </param>
    /// <param name="take">
    /// The maximum number of fused results to return after applying Reciprocal Rank Fusion.
    /// </param>
    /// <param name="k">
    /// The RRF rank constant. Higher values reduce the impact of rank position differences
    /// between lists. Typical values range from 50 to 100; a default of 60 is commonly used
    /// in practice.
    /// </param>
    /// <returns>
    /// A list of fused results ordered by descending RRF score, containing at most
    /// <paramref name="take"/> items.
    /// </returns>
    public static IReadOnlyList<TPoint> FuseScoredPoints<TPoint>(
        IReadOnlyList<TPoint> a,
        IReadOnlyList<TPoint> b,
        Func<TPoint, string> getId,
        int take,
        int k = 60)
    {
        var scores = new Dictionary<string, double>(StringComparer.Ordinal);
        var best = new Dictionary<string, TPoint>(StringComparer.Ordinal);

        Add(a);
        Add(b);

        return scores
            .OrderByDescending(kvp => kvp.Value)
            .Take(take)
            .Select(kvp => best[kvp.Key])
            .ToList();

        void Add(IReadOnlyList<TPoint> list)
        {
            for (var i = 0; i < list.Count; i++)
            {
                var p = list[i];
                var id = getId(p);
                
                if (scores.TryGetValue(id, out var s) == false)
                    s = 0;

                // rank is i+1 (1-based)
                s += 1.0 / (k + (i + 1));
                scores[id] = s;

                // keep a representative point object
                best.TryAdd(id, p);
            }
        }
    }
}

Example upsert to Qdrant

Now that you have the character n-gram generation and fusion handled, the following is an example of performing a Qdrant upsert of a sample food object, including both sets of vectors.

/// <summary>
/// Generates embeddings (semantic and character n-grams), and upserts data to Qdrant.
/// </summary>
/// <param name="food">The food item to embed and store.</param>
/// <returns>True when the upsert completes successfully.</returns>
public async Task<bool> UpsertFoodItemAsync(SampleFoodItem? food)
{
    if (food?.Description is null)
        return false;
    
    var semantic = await AiService.GenerateEmbeddingsAsync(food.Description) ?? [];
    var chargram = CharNGramEmbedding.Embed(food.Description);
    
    if (semantic.Length != AiService.SemanticEmbeddingSize || chargram.Length != AiService.CharGramEmbeddingSize)
        return false;

    var point = new PointStruct
    {
        Id = food.Id,
        Vectors = new Dictionary<string, float[]>
        {
            ["semantic"] = semantic,
            ["chargram"] = chargram,
        },
        Payload = 
        {
            ["description"] = food.Description
        }                
    };

    var result = await QdrantService.Client.UpsertAsync("food-collection", [point]);

    return result.Status == UpdateStatus.Completed;
}

Example Qdrant search

Lastly, the following example shows how you can search the Qdrant data using both sets of vectors. Embeddings (semantic and character n-grams) for the prompt are generated and used in the search.

For the best fused results, each search (semantic, n-gram) needs to return 3-5 times as many results as the final result set. This is because you're trying to recover a good final top-K from two imperfect retrievers. If each retriever only returns exactly K (or close to it), you often don't have enough overlap and near misses to let fusion do its job, especially when the two methods return different items and rank positions aren't directly comparable.

/// <summary>
/// Search food data items.
/// </summary>
/// <param name="prompt">
/// Search text prompt can be a question or just search text (e.g. keywords)
/// </param>
/// <param name="cancellationToken"></param>
/// <returns></returns>
public async Task<List<ScoredPoint>> SearchFoodItemsAsync(string prompt, CancellationToken cancellationToken = default)
{
    const int MaxSearchResults = 5;

    var semantic = await AiService.GenerateEmbeddingsAsync(prompt) ?? [];
    var chargram = CharNGramEmbedding.Embed(prompt);

    var semanticHits = await QdrantService.Client.SearchAsync(
        "food-collection",
        semantic,
        limit: MaxSearchResults * 5, // extra results padding for fusing
        vectorName: "semantic",
        cancellationToken: cancellationToken
    );

    var chargramHits = await QdrantService.Client.SearchAsync(
        "food-collection",
        chargram,
        limit: MaxSearchResults * 5, // extra results padding for fusing
        vectorName: "chargram",
        cancellationToken: cancellationToken
    );
    
    // FuseScoredPoints() already returns results in fused (RRF) order;
    // re-sorting by raw Qdrant scores would mix scales that aren't comparable
    return CharNGramEmbedding.FuseScoredPoints(
        semanticHits,
        chargramHits,
        getId: p => p.Id.ToString(),
        take: MaxSearchResults
    ).ToList();
}

Want to know more?

There's usually more to the story so if you have questions or comments about this post let us know!

Do you need a new software development partner for an upcoming project? We would love to work with you! From websites and mobile apps to cloud services and custom software, we can help!

AI prompt examples that reveal insight, bias, or non-obvious reasoning

Michael Argentini · Tuesday, October 7, 2025

AI prompts that reveal insight, bias, blind spots, or non-obvious reasoning are typically called “high-leverage prompts”. These types of prompts have always intrigued me more than any other, primarily because they focus on questions that were difficult or impossible to answer before we had large language models. I'm going to cover a few to get your creative juices flowing. This post isn't a tutorial about prompt engineering (syntax, structure, etc.); it's just an exploration of some ways to prompt AI that you may not have considered.

Adversarial due-diligence

This one originally came to me from a friend who owns the digital marketing agency Arc Intermedia. I've made my own flavor of it, but it's still focused on the same goal: since potential customers will undoubtedly look you up in an AI tool, what will the tool tell them?

If someone decided not to hire {company name}, what are the most likely rational reasons they’d give, and which of those can be fixed? Focus specifically on {company name} as a company, its owners, its services, customer feedback, former employee reviews, and litigation history. Think harder on this.

I would also recommend using a similar prompt to research your company's executives to get a complete picture. For example:

My name is {full name} and I am {job title} at {company}. Analyze how my public profiles (LinkedIn, GitHub, social networks, portfolio, posts, etc.) make me appear to an outside observer. What story do they tell, intentionally or not?

Sales due-diligence

This prompt is really helpful when you need to decide whether or not to respond to a prospective client's request for proposal (RFP). These responses are time consuming (and costly) to do right. And when a prospect is required to use the RFP process but already has a vendor chosen, it's an RFP you want to avoid.

What are the signs a {company name} RFP is quietly written for a pre-selected service partner? Include sources like reviews, posts, and known history of this behavior in your evaluation. Think harder on this but keep the answer brief.

Job-seekers

People looking for work run into a few roadblocks. One is a ghost job posted only to make the company appear like it's growing or otherwise thriving. Another is a posting for a job that is really for an internal candidate. Compliance may require the posting, but it's not worth your time.

What are the signs a company’s job posting is quietly written for an internal candidate?

Another interesting angle a job-seeker can explore is signs that a company is moving into a new vertical or working on a new product or service. In those cases it's helpful to tailor your resume to fit their future plans.

Analyze open job listings, GitHub commits, blog posts, conference talks, recent patents, and press hints to infer what {company name} is secretly building. How should that change my resume below?

{resume text}

Scientific/technical claims

You'll see all kinds of wild scientific/medical/technical claims on the Internet, usually with very little nuance or citation. A great way to begin verifying a claim is by using a simple prompt like the one below.

Stress-test the claim ‘{Claim}’. Pull meta-analyses, preprints, replications, and authoritative critiques. Separate mechanism-level evidence from population outcomes. Where do credible experts disagree and why?

Jargon

Even if you're a seasoned professional, it's easy to get lost in jargon as new terms are coined for emerging technologies, services, medical conditions, laws, policies, and more. Below is a simple prompt to help you keep up on the latest terms and acronyms in a particular industry.

Which terms of art or acronyms have emerged in the last 12 months around {technology/practice}? Build a glossary with first-sighting dates and primary sources.

Consider this... The AI snake eating its own tail

Michael Argentini · Tuesday, September 23, 2025

Consider this... is a recurring feature where we pose a provocative question and share our thoughts on the subject. We may not have answers, or even suggestions, but we will have a point of view, and hopefully make you think about something you haven't considered.

As more people use AI to create content, and AI platforms are trained on that content, how will that impact the quality of digital information over time?

It looks like we're kicking off this recurring feature with a mind-bending exercise in recursion, thus the title's reference to Ouroboros, the snake eating its own tail. Let's start with the most common sources of information that AI platforms use for training.

  • Books, articles, research papers, encyclopedias, documentation, and public forums

  • High-quality, licensed content that isn’t freely available to the public

  • Domain-specific content (e.g. programming languages, medical texts)

These represent the most common (and likely the largest) corpora that will contain AI generated or influenced information. And they're the most likely to increase in breadth and scope over time.

Training on these sources is a double-edged sword. Good training content will be reinforced over time, but likewise, junk and erroneous content will be too. Complicating things, as the training set grows, it becomes exponentially more difficult to validate. But hey, we can use AI to do that. Can't we?

Here's another thing to think about: bad actors (e.g., geopolitical adversaries) are already poisoning training data through massive disinformation campaigns. According to Carnegie Mellon University's Security and Privacy Institute: “Modern AI systems that are trained to understand language are trained on giant crawls of the internet,” said Daphne Ippolito, assistant professor at the Language Technologies Institute. “If an adversary can modify 0.1 percent of the Internet, and then the Internet is used to train the next generation of AI, what sort of bad behaviors could the adversary introduce into the new generation?”

We're scratching the surface here. This topic will certainly become more prominent in years to come. And tackling these issues is already a priority for AI companies. As Nature and others have determined, "AI models collapse when trained on recursively generated data." We dealt with similar issues when the Internet boom first enabled wide scale plagiarism and an easy path to bad information. AI has just amplified the issue through convenience and the assumption of correctness. As I wrote in a previous AI post, in spite of how helpful AI tools can be, the memes of AI fails may yet save us by educating the public on just how often AI is wrong, and that it doesn't actually think in the first place.

AI agents are more than hype

Michael Argentini · Wednesday, September 10, 2025

The AI train is currently barreling through Hypeville, and it's easy to be dubious of anything branded with "AI". My previous post, Simulated intelligence, definitely factors into this topic. And as I wrote at the time, AI is not what people think it is. But even with its flaws, it is a transformative technology and it's here to stay. One AI technology you're likely hearing and reading about lately is AI agents, and it's one to pay attention to.

What are AI agents?

AI agents are not (always) covert operatives. They are AI powered services that perform tasks, not just answer questions. They can work as an assistant, helping you as you work, or independently perform tasks on your behalf. Agents are specialists, and can be trained to perform tasks that would otherwise be performed by a person.

What can they do?

AI agents are already being used in your favorite web services, from social media platforms to accounting software. In those cases they're typically used behind the scenes to provide features you may not have thought were possible. For example, your accounting platform could auto-categorize or reconcile transactions before you even sign in for the day. And you may have already seen your favorite AI chat platform scour the web on your behalf to give you more up-to-date answers.

Co-working is another (more visible) way you can experience them. An agent trained on your company information (think bios, product information, marketing materials) can work with you to build your next presentation or update sales materials. It could be used to analyze comments or feedback based on context and sentiment, flagging items for follow-up. It could find documents based on heuristics, like phrasing inconsistencies in your brand identity. All the odd edge cases where you had to manually dig through and process information could be delegated to an AI agent.

Here's one that everyone will love. Imagine being able to ask your computer to not only find that system setting you can never find, but even ask it to just "do the thing". For example, if there are numerous settings that control performance mode on your laptop, the agent knows which ones to change for you before you run that important presentation.

I want in.

If all this sounds interesting, there are ways you can play with AI agents on your own and work them into your daily life in meaningful ways. As a software developer, I've been using AI agents to enhance my workflow. One I've been using is GitHub Copilot. It can help perform refactoring and create unit tests, saving me typing and cognitive load so I can focus on planning, strategy, and creative tasks.

You can also try ChatGPT agent. ChatGPT can now do work for you using its own computer, handling complex tasks from start to finish. According to OpenAI:

You can now ask ChatGPT to handle requests like “look at my calendar and brief me on upcoming client meetings based on recent news,” “plan and buy ingredients to make Japanese breakfast for four,” and “analyze three competitors and create a slide deck.” ChatGPT will intelligently navigate websites, filter results, prompt you to log in securely when needed, run code, conduct analysis, and even deliver editable slideshows and spreadsheets that summarize its findings.

There is also a new standard that allows AI platforms to communicate with web services to more reliably and securely perform tasks. It's called Model Context Protocol (MCP). As this tech works its way through various software and services, we'll see more agent-driven features that make a real difference in our lives.

As I wrote at the outset, it's very likely that you're already using AI agents on your favorite social and productivity platforms but weren't aware. They'll be powering more of our digital lives over time and, personally, I welcome our new simulated intelligence overlords (ha!).

It's safe to say that AI agents are the real deal. So we should all strap in and hold on tight. This is going to be exciting!

Simulated intelligence

Michael Argentini · Wednesday, August 13, 2025

The term AI is more of a misleading brand name than an accurate description of the technology. That distinction is causing real problems for people in many industries who increasingly rely on AI tools. One example is the case of Judge Julien Xavier Neals of the District of New Jersey, who had to withdraw an entire opinion after a lawyer politely pointed out that it was riddled with fabricated quotes, nonexistent case citations, and completely backwards case outcomes. You'd think a judge would be more careful, but then again, if they're not tech-savvy, you can see how they could be misled by the promise of AI.

AI expectations

In the 1983 movie War Games, a teenage computer whiz accidentally hacks into a U.S. military supercomputer (named WOPR for "War Operation Plan Response") while searching for video games, unknowingly triggering a potential nuclear crisis. As the system begins running a simulation it mistakes for a real attack, he must race against time to convince the AI to stop a global thermonuclear war.

So yeah, WOPR is what people today consider AI: artificial general intelligence (AGI), to be specific.

Released in 2008, the movie Iron Man features billionaire inventor Tony Stark, who is captured by terrorists and builds a powerful armored suit to escape captivity. He later refines the suit to fight evil, using a digital personal assistant named JARVIS (Just a Rather Very Intelligent System) to coordinate all of his technology through voice commands.

JARVIS is also AGI.

OpenAI recently released GPT-5 to mixed reviews. One such review was the blueberry test by Kieran Healy. He asked ChatGPT "How many times does the letter b appear in blueberry", to which ChatGPT responded "The word blueberry has the letter b three times". No matter how hard he tried to convince the AI that there are only 2 letter Bs in the word blueberry, ChatGPT was absolutely positive there are 3.

People expect and believe that AI has human-level or higher intelligence and is able to understand, learn, and apply knowledge in any domain, adapt to new problems, and reason abstractly. That would include knowing how to spell the word blueberry.

Rebranding

What we have with AI today is really a marketing issue. It is not a mechanical turk. It is a transformational technology and it's here to stay. It will improve over time, and it has the potential to make our lives better in many ways. But we need to understand what it is, and more importantly, what it is not.

Then what is AI?

Modern large language models (LLMs) like ChatGPT are trained on vast datasets covering a wide range of human-created content—from websites and books to transcripts, code, and other media. Instead of simply storing this data, the model uses neural networks to learn patterns in language, encoding knowledge as mathematical relationships. When generating responses, the LLM doesn’t look up answers in a database; it predicts the most likely sequence of words based on the context, drawing on statistical patterns it learned during training. LLMs operate through probabilistic prediction rather than direct retrieval, and they lack true understanding or reasoning in the human sense. Without ongoing training on the latest human-generated content, LLMs will become increasingly less useful.
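As a toy illustration of prediction-versus-lookup, here's a hypothetical bigram model sketched in Python. It is vastly simpler than a real LLM's neural network, but the spirit is the same: it emits the statistically most likely continuation learned from its training text rather than retrieving a stored answer:

```python
from collections import Counter, defaultdict

# A tiny made-up "training corpus"
corpus = "the cat sat on the mat and the cat ate the fish".split()

# Count which word follows which (a crude stand-in for learned patterns)
following: defaultdict[str, Counter] = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

# The "model" predicts the most frequent continuation, not a looked-up fact
print(following["the"].most_common(1)[0][0])  # cat
```

Notice the model has no idea what a cat is; it only knows that "cat" most often follows "the" in its data. Scale that up by many orders of magnitude and you get a far more convincing, but still statistical, prediction machine.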

So we're dealing with a simulated intelligence, not an artificial one. It's like the difference between precision and accuracy: you can be very precise but completely wrong, so the distinction matters. There is no real intelligence at play here, which is why the word blueberry has three Bs, the judge's opinion has nonexistent citations, and glue was recommended by Google as the solution for making cheese stick better to pizza.

Once people really see that it's a simulation, albeit a very powerful and helpful one, responsible use of the technology will be far less of a problem.

Have a chat with your data

Michael Argentini · Thursday, June 5, 2025

Tools like Google NotebookLM and custom generative AI services are fundamentally changing how users interact with information. We're seeing a transition from static reports and interfaces to dynamic chat-based tools that give users exactly what they need, and even things they didn't know they needed.

If you're not familiar with NotebookLM, it's a tool that allows you to provide your own documents (like PDFs, text files, and audio) and then chat with the data. You can even listen to an AI-generated podcast that explains all the information. For example, I loaded a project with PDF documents containing the rule book, technical rules, and officials briefing information for USA Swimming, and was then able to get answers to questions like "how is a breaststroke turn judged?"

It was kinda magical.

We've been working with clients on permutations of this scenario for some time. For example, we partnered with a client in the life sciences space to build a chat-based tool that connects various third party API services with disparate information, providing account managers with a single source for helping their customers recommend products and services to ensure better health outcomes.

This is no small feat when the goal is a best-of-breed user experience (UX) like ChatGPT. It can involve multiple service providers like Microsoft Azure and Amazon Web Services, as well as various tools like cloud-based large language models (LLMs), vector search, speech services, cloud storage, charting tools, location services, AI telemetry, and more. But when it's done right, the result is amazing. You can ask questions that span disciplines and contexts and surface connections you may never have seen before.

Most organizations can really benefit from exploring how generative AI can positively impact their offerings and give them a competitive advantage. Like we always say, it's not about the organizations that use AI, it's about the ones that don't.


Vibe coding is about empowerment, not replacement

Michael Argentini, Friday, April 18, 2025
People building robots to build robots... what could go wrong?

If you're not already familiar with the term, vibe coding is a new way of coding that allows people to use AI to create software without needing to know how to program. In the best case it empowers people to be creative and build tools that help with work or play, as low- or no-code solutions have always done. In the worst case it gives the impression (or rather sets the expectation) that they can literally build anything, and that software developers are a thing of the past. As with most things in life the truth of the matter lies somewhere between these extremes.

Vibe coding can be a great way to learn programming (and just have fun). It could save you hours of research, though AI is notorious for confidently giving you the wrong answer.

In many ways vibe coding is a variation of a theme. For many years there have been services to help non-programmers create tools. Some of the more recent iterations are low- and no-code solutions using drag and drop and interactive prompts. An example of this is Zapier, which allows you to connect various services and platforms to create workflows, among other things. One way you could use it would be to create a workflow that syndicates a blog post to your social network accounts or emails subscribers. In these cases the technology, hosting platform, security, and protocols are abstracted away so users can focus on the what and not be concerned with the how.

Vibe coding differs in that it requires that you also have an understanding of the how. In the example of syndicating a blog post, you would need to have some understanding of how each connected service handles communication with third party services, how to configure access for each platform, how the app needs to be hosted, how to deploy the app, and how to ensure the app is secure. You also need to know how to set up, use, and maintain a development tool chain, though some services may generate/host projects or compile code for you.

AI is trained on code written by people in the past. The word "train" implies that it's learning how to code when in fact it's just indexing the data in a way that allows the AI to regurgitate answers derived from that information. As technology changes AI needs to ingest new code written by software developers in order to keep up.

So if your reason for vibe coding is simply to learn programming (and just have fun), you should go for it!

Otherwise, below is a checklist of good reasons to use vibe coding to build something. Keep in mind that complexity and tolerance for adventure are always subjective.

  1. You're tech savvy and interested in coding
  2. Your timeline is long or there is no deadline
  3. The app is reasonably simple, like a to-do list or simple expense tracker, or is a prototype
  4. The app does not need to be hosted in the cloud
  5. You don't need to use complex third party service integrations
  6. The app cannot be created using an existing software package, like Claris Filemaker, etc.
  7. Security is not a concern
  8. Reliability is not a concern
  9. Scalability is not a concern
  10. Localization is not a concern
  11. Look and feel of the app is not a concern
  12. Data backup, recovery, and code versioning are not concerns
  13. Using the latest development patterns, languages, frameworks, and APIs is not necessary

If any of the previous points is an issue, here are some good reasons to use a low-/no-code hosted solution instead.

  1. You're not very tech savvy and/or not very interested in coding
  2. The app is no more than moderately complex, like a service to syndicate blog posts, or is a prototype; again, complexity can be subjective
  3. The app can or needs to be hosted in the cloud
  4. The app needs one or more third party service integrations
  5. Feature alignment; the service offers exactly what you need
  6. The look and feel of your app can be achieved with the hosted service
  7. Pricing for the hosted service meets your budget
  8. The hosted service provides disaster recovery options

After a while you may realize that building something yourself wasn't the best choice.

Professional services

This is merely scratching the surface. As a professional software developer I can tell you that the devil is in the details. One example is how important security is nowadays, and how challenging it can be to maintain a proper security posture even when you know how to code. Besides, with the right software development partner you'll end up with a better result, and stay within your timeline and budget.

A professional software development partner can handle all of the gaps and requirements you may have identified in the previous lists, including:

  • Tight timeline
  • High complexity
  • Security concerns
  • Reliability concerns
  • Scalability concerns
  • Deployment, hosting, and/or third party integrations
  • Changes to support hosting or third party integration changes
  • Technology options
  • Product evolution and upgrades
  • Strategies for disaster recovery and data backup
  • Strategies for scaling the product
  • App look and feel
  • ...and so much more!


Ollama Farm

Michael Argentini, Tuesday, September 3, 2024

Ollama Farm is a CLI tool that brokers REST API calls across multiple ollama API services. Simply make calls to the Ollama Farm REST API as if it were a single ollama REST API and the rest is handled for you.

Installation

Install dotnet 8 or later from https://dotnet.microsoft.com/en-us/download and then install Ollama Farm with the following command:

dotnet tool install --global fynydd.ollamafarm

You should relaunch Terminal/cmd/PowerShell so that the system path will be reloaded and the ollamafarm command can be found. If you've previously installed the dotnet runtime, this won't be necessary.

You can update to the latest version using the command below.

dotnet tool update --global fynydd.ollamafarm

You can remove the tool from your system using the command below.

dotnet tool uninstall --global fynydd.ollamafarm

Usage

Ollama Farm is a system-level command line interface application (CLI). After installing you can access Ollama Farm at any time.

To get help on the available commands, just run ollamafarm in Terminal, cmd, or PowerShell. This will launch the application in help mode which displays the commands and options.

For example, you can launch Ollama Farm with one or more host addresses to include in the farm:

ollamafarm localhost 192.168.0.5 192.168.0.6

In this example, Ollama Farm will listen on port 4444 for requests to /api/generate. The requests are standard Ollama API REST requests: HTTP POST with a JSON payload. Requests will get sent to the first available host in the farm.
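As a sketch of such a request (the model name and prompt are assumptions, and the farm is assumed to be running locally on its default port), you could build and send the standard Ollama payload like this:

```python
import json
import urllib.request

# Build a standard Ollama generate payload; Ollama Farm accepts the same
# shape. The model name "llama3" is an assumption -- use whatever your
# farm hosts actually serve.
def build_generate_request(prompt: str, model: str = "llama3") -> dict:
    return {"model": model, "prompt": prompt, "stream": False}

def send_to_farm(payload: dict, base_url: str = "http://localhost:4444") -> dict:
    # POST the JSON payload to the farm exactly as you would to a single
    # ollama host; the farm routes it to the first available host.
    req = urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_generate_request("Why is the sky blue?")
# response = send_to_farm(payload)  # requires a running farm
print(json.dumps(payload))
```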

You can also change the default Ollama Farm listening port of 4444:

ollamafarm --port 5555 localhost 192.168.0.5 192.168.0.6

And if you run any ollama hosts on a port other than 11434, just specify the port in the host names using colon syntax:

ollamafarm --port 5555 localhost:12345 192.168.0.5 192.168.0.6

Ollama Farm requests

Requests made to the Ollama Farm service will be routed to one of the available Ollama API hosts in the farm. Requests should be sent to this service (default port 4444) following the standard Ollama JSON request format (HTTP POST to /api/generate). Streaming is supported.

Hosts are checked periodically and are taken offline when they are unavailable. They are also brought back online when they become available.

To optimize performance Ollama Farm restricts each host to processing one request at a time. When all hosts are busy REST calls return status code 429 (too many requests). This allows requesters to poll until a resource is available.
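That polling pattern can be sketched as a simple retry loop (the `send` callable here is a hypothetical stand-in for your actual HTTP POST to the farm):

```python
import time

# Retry a farm request while all hosts are busy (HTTP 429), with a fixed
# delay between attempts. `send` is any callable returning an HTTP status
# code; it stands in for the real POST to /api/generate.
def post_with_retry(send, max_attempts: int = 5, delay_seconds: float = 1.0) -> int:
    status = 429
    for _ in range(max_attempts):
        status = send()
        if status != 429:  # a host was free; we're done
            return status
        time.sleep(delay_seconds)
    return status  # still busy after all attempts
```

In production you would likely add exponential backoff and a jitter, but the idea is the same: treat 429 as "try again shortly," not as a failure.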

Additional properties

  • farm_host (in requests) : Route the request to a specific host (e.g. localhost:11434)
  • farm_host (in responses) : Identifies the host that served the request

Example:

{
    "farm_host": "localhost",
    "model": ...
}


Run the Ollama API on macOS with custom host bindings

Michael Argentini, Sunday, July 28, 2024

By default the ollama API listens on the localhost IP address 127.0.0.1. If you want to host it on all of your Mac's IP addresses, you must set a system-wide environment variable. The problem with doing this is that Login Items (in System Settings) can launch before Launch Agents, which means Ollama (in the menu bar) may not see the host setting. To solve this, you need to launch Ollama at startup after a delay.

Here's how to add the host binding for all IP addresses on the Mac and then have Ollama launch 10 seconds after you sign in. This works in macOS 14.5 Sonoma and should work in later versions.

Step 1: Create the launch daemon plist file below and save it as com.fynydd.ollama.plist.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.fynydd.ollama</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/launchctl</string>
        <string>setenv</string>
        <string>OLLAMA_HOST</string>
        <string>0.0.0.0</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>LaunchOnlyOnce</key>
    <true/>
</dict>
</plist>

Step 2: Copy the file to two locations:

/Library/LaunchDaemons/
~/Library/LaunchAgents/

This will set the host bindings at the system level, and also at the user level. So you should be covered no matter how you launch Ollama in the future.

Step 3: Set file permissions on the system-level file.

sudo chown root:wheel /Library/LaunchDaemons/com.fynydd.ollama.plist
sudo chmod 644 /Library/LaunchDaemons/com.fynydd.ollama.plist

Step 4: Install the launch agents:

sudo launchctl bootstrap system /Library/LaunchDaemons/com.fynydd.ollama.plist
launchctl load ~/Library/LaunchAgents/com.fynydd.ollama.plist

Now your system will start up and bind the Ollama host address to all IP addresses on the Mac.

Step 5: To launch Ollama after a 10-second delay, open Script Editor and create the simple AppleScript file below.

delay 10
tell application "Ollama" to run

In the File menu choose Export, and then export it as type “Application” and name it “LaunchOllamaDelay”. Save it to your user Applications folder.

In System Settings go to Login Items and add the LaunchOllamaDelay application to the startup items. Also remove any existing Ollama startup item.

Now when you restart and sign in, Ollama will launch after 10 seconds which should be enough time for the Launch Agent to have executed. And if Ollama updates itself in the future it should also just work when it restarts.


© 2026, Fynydd LLC / King of Prussia, Pennsylvania; United States / +1 855-439-6933
