{"id":5658,"date":"2026-04-22T21:59:09","date_gmt":"2026-04-22T20:59:09","guid":{"rendered":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/"},"modified":"2026-04-22T22:01:28","modified_gmt":"2026-04-22T21:01:28","slug":"googles-gemma-4-shines-on-local-systems-both-big-and-small","status":"publish","type":"post","link":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/","title":{"rendered":"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small"},"content":{"rendered":"<div class=\"anp-pro-entry\">\n<p class=\"anp-pro-p\">Google\u2019s Gemma 4 is touted as the latest evolution of Google\u2019s multimodal model offerings. Gemma 4 offers not only reasoning and tool use but also vision and audio functionality, and it\u2019s available in a range of model sizes that target servers and local devices.<\/p>\n<p class=\"anp-pro-p\">What\u2019s striking about Gemma 4 is that even at the higher end of its size range, it still performs decently on personal hardware. Google claims this is due to innovations in the architecture of the model, but the proof is in the trying. 
Gemma 4 is quite responsive.<\/p>\n<p class=\"anp-pro-p\">To that end, I took Gemma 4 for a spin on my own hardware to see how it fared at its advertised tasks.<\/p>\n<figure class=\"anp-pro-inline-figure\" style=\"margin:1.75em auto;text-align:center;max-width:100%\"><img decoding=\"async\" class=\"anp-pro-inline-img\" src=\"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/image_31.png\" alt=\"\" style=\"margin:0 auto;max-width:100%;width:auto;height:auto;object-fit:contain;object-position:center\" loading=\"lazy\"><\/figure>\n<p class=\"anp-pro-p\">Each of these model sizes is available in a slew of community-created editions, thanks to Gemma 4\u2019s Apache 2.0 licensing. For instance, the 26B A4B model comes in a community edition with more compact quantizations (4-bit, 6-bit, etc.), which I used as one of the models tested for this article.<\/p>\n<p class=\"anp-pro-p\">I ran each model using my now-standard test bed: LM Studio 0.4.10 on an AMD Ryzen 5 3600 6-core CPU (32GB RAM) and an Nvidia GeForce RTX 5060 (8GB VRAM).<\/p>\n<p class=\"anp-pro-p\">The 26B model was at the upper end of what I could run comfortably on my test hardware. I wasn\u2019t able to fit the entire model into GPU memory, so I set the first 12 layers to run on the GPU (7.51GB VRAM), and I set the context length to 16,384 tokens (total: 18.76GB RAM).<\/p>\n<p class=\"anp-pro-p\">Getting good performance out of models that don\u2019t fit in VRAM is always a challenge. However, Gemma 4 has, courtesy of its \u201cmixture of experts\u201d design, a feature to boost performance. LM Studio exposes this feature through a setting currently tagged as experimental. You can choose how many layers of the model to \u201cforce MoE [Mixture of Experts] weights onto the CPU,\u201d which conserves VRAM and can speed up inference.<\/p>\n<p class=\"anp-pro-p\">The MoE (mixture of experts) experimental setting in LM Studio. 
For models that use an MoE design, this setting forces the expert weights to run on the CPU instead of the GPU. With Gemma 4, this resulted in a major speed boost for models too big to fit in GPU memory.<\/p>\n<p class=\"anp-pro-p\">Without MoE forcing, overall inference time ballooned and token generation speed cratered; the model could barely manage an average of 1.5 tokens per second even for simple queries. With MoE forcing turned on (with the maximum number of layers supported, 30), token generation speed jumped to anywhere from 5 to 13 tokens per second, depending on the rest of the system\u2019s load. That\u2019s still a far cry from the speed of the smaller models, but a lot more workable.<\/p>\n<figure class=\"anp-pro-inline-figure\" style=\"margin:1.75em auto;text-align:center;max-width:100%\"><img decoding=\"async\" class=\"anp-pro-inline-img\" src=\"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/image_25.png\" alt=\"\" style=\"margin:0 auto;max-width:100%;width:auto;height:auto;object-fit:contain;object-position:center\" loading=\"lazy\"><\/figure>\n<p class=\"anp-pro-p\">For faster time-to-first-token results, you can disable thinking, at the possible cost of less robust output. With thinking enabled on the code-generation query, Gemma 4 spent 6 minutes 26 seconds thinking and over 8 minutes generating the response (5,013 tokens, 9.55 tokens per second). The resulting code and explanation were not significantly more advanced or detailed than the non-thinking version.<\/p>\n<p class=\"anp-pro-p\">Response from Gemma 4\u2019s 26B parameter model to a query to generate code. This larger version of the model runs less quickly when it can\u2019t fit entirely in GPU memory, but its mixture-of-experts design helped offset that limitation.<\/p>\n<p class=\"anp-pro-p\">When I switched to the LM Studio Community edition of the E4B model, I put all 42 layers on the GPU and kept the context at 16,384, all of which fit comfortably in VRAM with room to spare. 
The result was a major jump in speed: 72 tokens per second. The smaller model was less specific for certain queries (the code-generation query in particular produced only a conceptual framework rather than a comprehensive code example), but it still did a decent job of analyzing the problem and suggesting constructive approaches. The \u201cunsloth\u201d edition of the E4B model, despite being slightly smaller, was about as performant and useful.<\/p>\n<p class=\"anp-pro-p\">Examples of Gemma 4\u2019s 26B parameter version generating image captions. The smaller versions of the model tended not to editorialize. The larger version sometimes needed specific guidance to be less verbose or florid.<\/p>\n<p class=\"anp-pro-p\">For the \u201cmake this program more modular\u201d prompt, I got roughly equivalent results across all incarnations of the model in terms of the advice given. The only major difference was that the smaller models ran far faster: 73.85 and 71.73 tokens per second vs. 9.3 for the big model.<\/p>\n<p class=\"anp-pro-p\">The biggest takeaway from running Gemma 4 locally is how the mixture-of-experts design in one of the larger incarnations of the model makes it useful even on systems where the model doesn\u2019t fit entirely into VRAM. The smaller incarnations of the model, even at lower-bit quantizations, still work well, too. They also deliver results many times faster and free up much more memory for larger context windows. 
Thus, the smaller models are well worth experimenting with as the first model of choice before moving up to their bigger brothers.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The topic Google\u2019s Gemma 4 shines on local systems \u2013 both big and small is currently &hellip; <a title=\"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small\" class=\"hm-read-more\" href=\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/\"><span class=\"screen-reader-text\">Google\u2019s Gemma 4 shines on local systems \u2013 both big and small<\/span>Read more<\/a><\/p>\n","protected":false},"author":0,"featured_media":5659,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[1211,245,1249,1248,803],"class_list":["post-5658","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-innovate","tag-gemma","tag-model","tag-smaller","tag-tokens","tag-vram"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Google\u2019s Gemma 4 shines on local systems \u2013 both big and small - 
innovatenews.site<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small - innovatenews.site\" \/>\n<meta property=\"og:description\" content=\"The topic Google\u2019s Gemma 4 shines on local systems \u2013 both big and small is currently &hellip; Google\u2019s Gemma 4 shines on local systems \u2013 both big and smallRead more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/\" \/>\n<meta property=\"og:site_name\" content=\"innovatenews.site\" \/>\n<meta property=\"article:published_time\" content=\"2026-04-22T20:59:09+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-04-22T21:01:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2560\" \/>\n\t<meta property=\"og:image:height\" content=\"1440\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/\"},\"author\":{\"name\":\"\",\"@id\":\"\"},\"headline\":\"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small\",\"datePublished\":\"2026-04-22T20:59:09+00:00\",\"dateModified\":\"2026-04-22T21:01:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/\"},\"wordCount\":934,\"image\":{\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg\",\"keywords\":[\"Gemma\",\"Model\",\"Smaller\",\"Tokens\",\"Vram\"],\"articleSection\":[\"Innovate\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/\",\"url\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/\",\"name\":\"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small - 
innovatenews.site\",\"isPartOf\":{\"@id\":\"https:\/\/innovatenews.site\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg\",\"datePublished\":\"2026-04-22T20:59:09+00:00\",\"dateModified\":\"2026-04-22T21:01:28+00:00\",\"author\":{\"@id\":\"\"},\"breadcrumb\":{\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#primaryimage\",\"url\":\"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg\",\"contentUrl\":\"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg\",\"width\":2560,\"height\":1440},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/innovatenews.site\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Google\u2019s Gemma 4 shines on local systems \u2013 both big and 
small\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/innovatenews.site\/#website\",\"url\":\"https:\/\/innovatenews.site\/\",\"name\":\"innovatenews.site\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/innovatenews.site\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small - innovatenews.site","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/","og_locale":"en_US","og_type":"article","og_title":"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small - innovatenews.site","og_description":"The topic Google\u2019s Gemma 4 shines on local systems \u2013 both big and small is currently &hellip; Google\u2019s Gemma 4 shines on local systems \u2013 both big and smallRead more","og_url":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/","og_site_name":"innovatenews.site","article_published_time":"2026-04-22T20:59:09+00:00","article_modified_time":"2026-04-22T21:01:28+00:00","og_image":[{"width":2560,"height":1440,"url":"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#article","isPartOf":{"@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/"},"author":{"name":"","@id":""},"headline":"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small","datePublished":"2026-04-22T20:59:09+00:00","dateModified":"2026-04-22T21:01:28+00:00","mainEntityOfPage":{"@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/"},"wordCount":934,"image":{"@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#primaryimage"},"thumbnailUrl":"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg","keywords":["Gemma","Model","Smaller","Tokens","Vram"],"articleSection":["Innovate"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/","url":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/","name":"Google\u2019s Gemma 4 shines on local systems \u2013 both big and small - 
innovatenews.site","isPartOf":{"@id":"https:\/\/innovatenews.site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#primaryimage"},"image":{"@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#primaryimage"},"thumbnailUrl":"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg","datePublished":"2026-04-22T20:59:09+00:00","dateModified":"2026-04-22T21:01:28+00:00","author":{"@id":""},"breadcrumb":{"@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#primaryimage","url":"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg","contentUrl":"https:\/\/innovatenews.site\/wp-content\/uploads\/2026\/04\/4156597-0-94313700-1776848538-shutterstock_2556469215-scaled.jpg","width":2560,"height":1440},{"@type":"BreadcrumbList","@id":"https:\/\/innovatenews.site\/index.php\/2026\/04\/22\/googles-gemma-4-shines-on-local-systems-both-big-and-small\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/innovatenews.site\/"},{"@type":"ListItem","position":2,"name":"Google\u2019s Gemma 4 shines on local systems \u2013 both big and 
small"}]},{"@type":"WebSite","@id":"https:\/\/innovatenews.site\/#website","url":"https:\/\/innovatenews.site\/","name":"innovatenews.site","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/innovatenews.site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/posts\/5658","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/comments?post=5658"}],"version-history":[{"count":1,"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/posts\/5658\/revisions"}],"predecessor-version":[{"id":5663,"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/posts\/5658\/revisions\/5663"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/media\/5659"}],"wp:attachment":[{"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/media?parent=5658"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/categories?post=5658"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/innovatenews.site\/index.php\/wp-json\/wp\/v2\/tags?post=5658"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}