<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Idea Ignition: Fueling Startups from Concept to Cloud]]></title><description><![CDATA[Transforming visions into realities - how to scale engineering functions and create products that last.]]></description><link>https://articles.concepttocloud.com</link><image><url>https://substackcdn.com/image/fetch/$s_!2EKk!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F162952e6-d461-4416-b5da-64329d4c29fb_800x800.png</url><title>Idea Ignition: Fueling Startups from Concept to Cloud</title><link>https://articles.concepttocloud.com</link></image><generator>Substack</generator><lastBuildDate>Wed, 03 Jun 2026 05:45:49 GMT</lastBuildDate><atom:link href="https://articles.concepttocloud.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Tom Barber]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[spicule@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[spicule@substack.com]]></itunes:email><itunes:name><![CDATA[Tom Barber]]></itunes:name></itunes:owner><itunes:author><![CDATA[Tom Barber]]></itunes:author><googleplay:owner><![CDATA[spicule@substack.com]]></googleplay:owner><googleplay:email><![CDATA[spicule@substack.com]]></googleplay:email><googleplay:author><![CDATA[Tom Barber]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Mondrian-4 YAML Schemas: Write Less, Ship More]]></title><description><![CDATA[Mondrian&#8217;s XML schema format is one of the most expressive schema languages in open-source 
BI.]]></description><link>https://articles.concepttocloud.com/p/mondrian-4-yaml-schemas-write-less</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/mondrian-4-yaml-schemas-write-less</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Wed, 03 Jun 2026 02:17:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2EKk!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F162952e6-d461-4416-b5da-64329d4c29fb_800x800.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Mondrian&#8217;s XML schema format is one of the most expressive schema languages in open-source BI. It is also one of the noisiest to write by hand. Every attribute is quoted, every element is closed, and every relationship lives inside a deeply nested angle-bracket tree that is completely invisible in a git diff. A single typo in a namespace declaration can prevent the engine from loading at all, and the error message tells you approximately nothing.</p><p>What if you could write the same schema in YAML?</p><p>As of mondrian-saiku 4.8.1.11 (PR #34), you can.</p><h1>What shipped</h1><p>Three things landed together:</p><h2>YAML as a first-class catalog format.</h2><p>RolapSchemaLoader now auto-detects YAML content by inspecting the catalog string &#8212; if it begins with schema: (after any whitespace, comments, or --- document markers) it is routed through the converter before the rest of the pipeline sees it. No configuration flag, no file-extension check, no extra properties. Point your Catalog URL at a .yaml file and Mondrian loads it.</p><h2>A bidirectional converter. </h2><p>The new M4YamlToXml / M4XmlToYaml pair converts in both directions with symmetric fidelity. 
The canonical representation stays XML (that is what XOM, the Mondrian XML object model, serializes); YAML is translated to a typed MondrianDef object graph first, so the converter is not fragile string-munging &#8212; it is the same structured pipeline Mondrian uses to load schemas from disk.</p><h2>A mondrian-schema CLI. </h2><p>A single shell script wraps the converter and a linter:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;c6a5209f-2c04-4403-a669-1a4fd8ebdf63&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash"># Convert an existing Mondrian-4 XML schema to YAML
./scripts/mondrian-schema to-yaml demo/FoodMart.mondrian.xml -o demo/FoodMart.yaml
# Convert back (round-trips losslessly, apart from the known limitations listed at the end of this post)
./scripts/mondrian-schema to-xml demo/FoodMart.yaml -o FoodMart-roundtrip.xml
# Lint a schema before deploying (validates structure; exits 2 on failure)
./scripts/mondrian-schema lint demo/FoodMart</code></pre></div><p>First invocation compiles and caches the classpath (target/dependency/classpath.txt). Subsequent runs are fast, around 130 ms.</p><p>Mondrian-3 support is unchanged. The M3 YAML format (shared dimensions as &lt;DimensionUsage&gt;, &lt;Join&gt;, &lt;Level&gt;, etc.) continues to work. M3 and M4 YAML are mutually exclusive and detected by the presence of the <code>physical_schema:</code> key.</p><div><hr></div><h2>Show, don&#8217;t tell</h2><p>Here is the Store shared dimension from the FoodMart schema: first as Mondrian-4 XML, then as its YAML equivalent.</p><h3>XML (51 lines)</h3><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;xml&quot;,&quot;nodeId&quot;:&quot;8e254751-df00-4a57-b0bb-cdb079f68d87&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-xml">&lt;Dimension name='Store' table='store' key='Store Id'&gt;
    &lt;Attributes&gt;
        &lt;Attribute name='Store Country' hasHierarchy='false'&gt;
            &lt;Key&gt;
                &lt;Column name='store_country'/&gt;
            &lt;/Key&gt;
        &lt;/Attribute&gt;
        &lt;Attribute name='Store State' keyColumn='store_state' hasHierarchy='false'/&gt;
        &lt;Attribute name='Store City' hasHierarchy='false'&gt;
            &lt;Key&gt;
                &lt;Column name='store_state'/&gt;
                &lt;Column name='store_city'/&gt;
            &lt;/Key&gt;
            &lt;Name&gt;
                &lt;Column name='store_city'/&gt;
            &lt;/Name&gt;
        &lt;/Attribute&gt;
        &lt;Attribute name='Store Id' keyColumn='store_id' hasHierarchy='false'/&gt;
        &lt;Attribute name='Store Name' keyColumn='store_name' hasHierarchy='false'&gt;
            &lt;Property attribute='Store Type'/&gt;
            &lt;Property attribute='Store Manager'/&gt;
            &lt;Property attribute='Store Sqft'/&gt;
            &lt;Property attribute='Grocery Sqft'/&gt;
            &lt;Property attribute='Frozen Sqft'/&gt;
            &lt;Property attribute='Meat Sqft'/&gt;
            &lt;Property attribute='Has coffee bar'/&gt;
            &lt;Property attribute='Street address'/&gt;
        &lt;/Attribute&gt;
        &lt;Attribute name='Store Type' keyColumn='store_type'
                   hierarchyAllMemberName='All Store Types'/&gt;
        &lt;Attribute name='Store Manager' keyColumn='store_manager' hasHierarchy='false'/&gt;
        &lt;Attribute name='Store Sqft' keyColumn='store_sqft' hasHierarchy='false'/&gt;
        &lt;Attribute name='Grocery Sqft' keyColumn='grocery_sqft' hasHierarchy='false'/&gt;
        &lt;Attribute name='Frozen Sqft' keyColumn='frozen_sqft' hasHierarchy='false'/&gt;
        &lt;Attribute name='Meat Sqft' keyColumn='meat_sqft' hasHierarchy='false'/&gt;
        &lt;Attribute name='Has coffee bar' keyColumn='coffee_bar' hasHierarchy='false'/&gt;
        &lt;Attribute name='Street address' keyColumn='store_street_address' hasHierarchy='false'/&gt;
    &lt;/Attributes&gt;
    &lt;Hierarchies&gt;
        &lt;Hierarchy name='Stores' allMemberName='All Stores'&gt;
            &lt;Level attribute='Store Country'/&gt;
            &lt;Level attribute='Store State'/&gt;
            &lt;Level attribute='Store City'/&gt;
            &lt;Level attribute='Store Name'/&gt;
        &lt;/Hierarchy&gt;
        &lt;Hierarchy name='Store Size in SQFT'&gt;
            &lt;Level attribute='Store Sqft'/&gt;
        &lt;/Hierarchy&gt;
    &lt;/Hierarchies&gt;
&lt;/Dimension&gt;</code></pre></div><h3>YAML (38 lines, from demo/FoodMart.yaml)</h3><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;63d63b41-eeef-4d2c-8582-de0b3f0815b9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">shared_dimensions:
  Store:
    table: "store"
    key: "Store Id"
    attributes:
    - name: "Store Country"
      key: ["store_country"]
      has_hierarchy: false
    - name: "Store State"
      key_column: "store_state"
      has_hierarchy: false
    - name: "Store City"
      key: ["store_state", "store_city"]
      name_column: "store_city"
      has_hierarchy: false
    - name: "Store Id"
      key_column: "store_id"
      has_hierarchy: false
    - name: "Store Name"
      key_column: "store_name"
      has_hierarchy: false
      properties:
      - "Store Type"
      - "Store Manager"
      - "Store Sqft"
      - "Grocery Sqft"
      - "Frozen Sqft"
      - "Meat Sqft"
      - "Has coffee bar"
      - "Street address"
    - name: "Store Type"
      key_column: "store_type"
      hierarchy_all_member_name: "All Store Types"
    hierarchies:
    - name: "Stores"
      all_member_name: "All Stores"
      levels: ["Store Country", "Store State", "Store City", "Store Name"]
    - name: "Store Size in SQFT"
      levels: ["Store Sqft"]</code></pre></div><p>The information content is identical. What disappeared: 79 closing tags, 34 attribute-quotation pairs, the &lt;Key&gt;&lt;Column .../&gt;&lt;/Key&gt; ceremony for multi-column keys (replaced by a plain list), and the &lt;Level attribute='...'/&gt; dance for hierarchy levels (replaced by a bare list of strings). What remained: every structural relationship, every column reference, every property association.</p><div><hr></div><h2>How this makes your life easier</h2><p>Migrate in one command. If you have an existing Mondrian-4 XML schema, converting it is a single CLI call:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;733cf72c-c0b8-4534-864d-4f095e29329c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">./scripts/mondrian-schema to-yaml your-schema.xml -o your-schema.yaml</code></pre></div><p>The output is immediately loadable; Mondrian detects the YAML format on next startup with no other changes.</p><p>Diff and code-review schemas like real code. XML diffs are dominated by closing-tag churn and indentation noise. YAML diffs show only the structural change. A hierarchy level addition is one line; adding a property to an attribute is one line. Pull-request reviews for schema changes become tractable.</p><p>Lint before you deploy. The lint subcommand validates the schema&#8217;s structure and reports errors with a non-zero exit code, which makes it easy to hook into CI:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;a349c38d-0697-40ce-bbeb-650bd85f9d54&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">./scripts/mondrian-schema lint your-schema.yaml

# exit 0 &#8594; clean

# exit 2 &#8594; diagnostic printed to stderr, CI fails</code></pre></div><p>Split large schemas across files. For schemas that have grown unwieldy, the schema format&#8217;s $ref include mechanism lets you break one big file into logical pieces:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;9aa418f5-29cb-437b-a32b-9340696f9e04&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml"># top-level schema file

schema: "MyWarehouse"
shared_dimensions:
  $ref: "dimensions/store.yaml"
cubes:
  $ref: "cubes/sales.yaml"</code></pre></div><p>$ref resolution works when the schema is loaded via a file:// Catalog URL; it is skipped when content is passed inline.</p><p>Keep schemas in version control readably. Because YAML is plain text with clean diff behaviour, you can commit your schema next to your dbt models or Liquibase migrations and treat schema changes as first-class tracked changes rather than opaque XML blobs.</p><div><hr></div><h2>How it works</h2><p>The architecture is straightforward and avoids string-munging entirely.</p><p>When RolapSchemaLoader reads a catalog and detects YAML content, it calls YamlSchemaConverter.toXml(). That method checks for a metamodel_version key to distinguish M4 from M3 YAML, then hands off to M4YamlToXml.</p><p>M4YamlToXml walks the parsed YAML map and builds a typed MondrianDef.Schema object graph, the same Java types that RolapSchemaLoader normally constructs from XOM-parsed XML. Once the object graph is built, XOM serializes it to XML, and from that point onward the loading pipeline is identical to reading an XML file directly.</p><p>The reverse path (M4XmlToYaml) uses M4CubeIngester to walk the MondrianDef graph and emit a structured YAML document. Neither direction manipulates XML as text; both operate on the typed object model. This means the converter inherits Mondrian&#8217;s own structural guarantees: it cannot produce XML that does not correspond to a valid MondrianDef tree.</p><div><hr></div><h2>Try it now</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;1ca120d0-f4ac-4d48-98b5-d2ebab373b1b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash"># 1. Convert your Mondrian-4 XML schema to YAML

./scripts/mondrian-schema to-yaml path/to/your-schema.xml -o your-schema.yaml

# 2. Inspect and edit the YAML

# 3. Lint it

./scripts/mondrian-schema lint your-schema.yaml

# 4. Point Mondrian at the YAML file

#    (no other changes required &#8212; Mondrian auto-detects YAML by content)

jdbc:mondrian:Jdbc=jdbc:...;Catalog=file:///path/to/your-schema.yaml;...</code></pre></div><p>If you prefer to keep XML as the source of truth, you can use YAML purely for authoring and commit the generated XML, or do the opposite: author in XML and generate YAML for review. Both directions are stable.</p><div><hr></div><h2>Known limitations and what is next</h2><p>The converter does not yet handle every Mondrian-4 feature:</p><ul><li><p><code>CopyLink</code> does not preserve the <code>attribute</code> field. Round-tripping a <code>CopyLink</code> with a non-default attribute loses that field.</p></li><li><p><code>DimensionGrant</code> in roles is not supported. Role definitions support <code>SchemaGrant</code> and <code>CubeGrant</code> but not dimension-level grants.</p></li><li><p>Some <code>CalculatedMember</code> fields are deferred. <code>caption</code>, <code>description</code>, <code>visible</code>, and <code>CellFormatter</code> are not yet captured.</p></li><li><p>Some <code>MeasureGroup</code> fields are deferred. <code>approxRowCount</code> and <code>ignoreUnrelatedDimensions</code> are not emitted.</p></li><li><p>Identifiers with embedded dots do not round-trip. The <code>table.column</code> encoding and <code>{col:}</code> token both split on the first <code>.</code>, which breaks quoted identifiers that contain a literal dot character.</p></li></ul><p>The remaining gaps are well-understood and tracked. Priority for the next phase is filling in the <code>CalculatedMember</code> deferred fields and adding <code>DimensionGrant</code> support so that complex role definitions can be authored and reviewed in YAML without loss.</p><div><hr></div><p>If you run into an edge case or have an M3 migration story to share, open an issue or reach out on the Saiku community channels.</p>]]></content:encoded></item><item><title><![CDATA[Rebuilding Saiku]]></title><description><![CDATA[An experiment with AI]]></description><link>https://articles.concepttocloud.com/p/rebuilding-saiku</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/rebuilding-saiku</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Thu, 14 May 2026 20:07:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!sR2t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Years ago, I used to run a commercial open source OLAP tool called Saiku. 
It went alongside, and sometimes inside, the Pentaho BI platform and allowed business folks to do drag-and-drop analysis of their data in a time when this type of analytics didn&#8217;t happen a great deal, and when it did, it was almost all proprietary.</p><p>At the time, we had lots of users, but not many people committing code or money to keep the platform afloat. 
It worked pretty well for lead gen, but it was also a huge time suck, and even though we had users from Amazon, Samsung, etc., converting them to any model that would support development was impossible.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sR2t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sR2t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png 424w, https://substackcdn.com/image/fetch/$s_!sR2t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png 848w, https://substackcdn.com/image/fetch/$s_!sR2t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png 1272w, https://substackcdn.com/image/fetch/$s_!sR2t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sR2t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png" width="1456" height="876" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:876,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:373608,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://articles.concepttocloud.com/i/197751604?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sR2t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png 424w, https://substackcdn.com/image/fetch/$s_!sR2t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png 848w, https://substackcdn.com/image/fetch/$s_!sR2t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png 1272w, https://substackcdn.com/image/fetch/$s_!sR2t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d13d71b-54f6-4fa6-9de5-b2a80c1caa70_2884x1736.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>So after years of being almost broke, I gave up and went to work for NASA, much to the chagrin of the people using the platform. There is some irony in being blasted for not supporting a platform that others refuse to commit to.</p><p>Fast forward to 2026 and the year of the Agentic AI agent, and I thought, &#8220;Hell, I wonder if this thing even compiles.&#8221; So I set Claude free on it and asked him to get the old version running. 
To my surprise, about 30 minutes later, he had a running version on my Mac. So then I thought, well, what else could I do with this? So I asked him to update the main dependencies, etc., and write tests to ensure it continued working. All seemed good.</p><p>So then I thought, well, the UI is a bit dated, it&#8217;s still cool and works really well, but there are so many different frameworks available these days, how about we just rebuild it? So I left him rebuilding the UI using Svelte, as it&#8217;s a framework I&#8217;m comfortable with. Once again, a few hours later, and after a few prompts to tweak it, Saiku had a new UI built with the latest in modern web technologies.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fvxI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fvxI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png 424w, https://substackcdn.com/image/fetch/$s_!fvxI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png 848w, https://substackcdn.com/image/fetch/$s_!fvxI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png 1272w, 
https://substackcdn.com/image/fetch/$s_!fvxI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fvxI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png" width="1456" height="878" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:878,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:571597,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://articles.concepttocloud.com/i/197751604?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fvxI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png 424w, https://substackcdn.com/image/fetch/$s_!fvxI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png 848w, 
https://substackcdn.com/image/fetch/$s_!fvxI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png 1272w, https://substackcdn.com/image/fetch/$s_!fvxI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83f61dd5-7975-4b06-8a9d-58a0afdb4461_2878x1736.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This continued for a bit longer, and then I went deep on what I could do with Mondrian, the OLAP engine that powers Saiku. 
We use version 4, which has a story of its own, but I also wanted to update it and improve support for newer databases, so I added Apache Calcite to the query-generating layer and blogged about it here.</p><p>So, of course, this brings me to now. That happened a few weeks ago, and now I have a couple of weeks of vacation, which we all know really means &#8220;time to finish up those projects we started but never finished&#8221;. And so I decided to release it. It&#8217;s not particularly well tested yet, but most functions work. I would like to do more with it. I&#8217;m not sure exactly what yet, but I&#8217;m working on something to make it easier to build out the OLAP cube specs. I&#8217;ve already started with some AI help and support for cube building. But I&#8217;m sure there is still more to do; some of it may be AI-based, some just improved support and functionality.</p><p>Of course, the question is: in the year 2026, does OLAP still serve a purpose? Do small or big businesses need something like Saiku? I don&#8217;t know what the answer is. 
But I figured, having got this far, I should release it, make it accessible and pay a bit of respect to a piece of software that launched my career, and helped get me to where I am today.</p>]]></content:encoded></item><item><title><![CDATA[Swapping Mondrian’s SQL emitter for Apache Calcite]]></title><description><![CDATA[Or: how I replaced 20 years of hand-tuned dialect code with a general-purpose SQL planner, broke performance by 27%, fixed a single ConcurrentHashMap key, recovered 4&#215; faster on one database and 10&#215;]]></description><link>https://articles.concepttocloud.com/p/swapping-mondrians-sql-emitter-for</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/swapping-mondrians-sql-emitter-for</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Thu, 23 Apr 2026 19:19:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2EKk!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F162952e6-d461-4416-b5da-64329d4c29fb_800x800.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Setup: why swap the SQL emitter at all</h2><p>Mondrian is an OLAP engine. Give it an MDX query + a schema definition, it hands you back cell values by generating and executing a bunch of SQL against your warehouse. Its SQL emission layer is hand-coded: a <code>SqlQuery</code> string builder plus ~30 dialect subclasses (<code>PostgreSqlDialect</code>, <code>MySqlDialect</code>, <code>OracleDialect</code>, <code>HsqldbDialect</code>, &#8230;) each with their own quirks around NULL ordering, date literals, pagination, quoting rules, identifier case.</p><p>Apache Calcite does the same job: given a logical plan, emit dialect-correct SQL. But Calcite&#8217;s approach is a relational algebra + cost model + rule-driven optimizer. 
It comes with 40+ <code>SqlDialect</code> implementations and a planner that can do things like push filters through joins, match materialized views, and reorder joins by selectivity.</p><p>The bet: <strong>replace Mondrian&#8217;s hand-rolled SQL builder with Calcite&#8217;s planner + unparser</strong>. Expected wins:</p><ul><li><p>Delete 30+ dialect classes and 20 years of accumulated workarounds.</p></li><li><p>Cost-based MV selection when multiple pre-aggregated tables could serve a query.</p></li><li><p>Easier to add new database support (Calcite already ships dialects for most mainstream databases).</p></li></ul><p>Expected risks:</p><ul><li><p>Calcite&#8217;s cost model needs tuning. A wrong row-count estimate and you&#8217;re scanning 86M rows instead of 86k.</p></li><li><p>Calcite&#8217;s optimizer can produce trees the JDBC unparser can&#8217;t handle if rule selection is loose.</p></li><li><p>Existing deployments may depend on Mondrian&#8217;s dialect-specific quirks we don&#8217;t know about.</p></li></ul><p>The plan: four worktrees, merged atomically at the end.</p><ol><li><p><strong>Foundations</strong> &#8212; wire Calcite in behind a kill switch. Default path stays legacy; <code>-Dmondrian.backend=calcite</code> routes through the new code. Establish a cell-set-parity harness.</p></li><li><p><strong>Natives</strong> &#8212; migrate <code>RolapNativeCrossJoin</code>, <code>RolapNativeFilter</code>, <code>RolapNativeTopCount</code>, and <code>DescendantsConstraint</code> to emit Calcite plans. These are the &#8220;native&#8221; tuple-read paths Mondrian uses to avoid materializing large sets in Java.</p></li><li><p><strong>Aggregates + calcs</strong> &#8212; register pre-aggregated tables (declared in schema as <code>&lt;MeasureGroup type='aggregate'&gt;</code>) and let Calcite cost-select them. 
Push arithmetic calc members down into SQL where possible.</p></li><li><p><strong>Cleanup</strong> &#8212; delete legacy SQL builder, delete 30 dialect classes.</p></li></ol><p>This post covers worktrees 1-3. Cleanup is the final atomic merge.</p><h2>The harness: 45 MDX queries, golden cell-sets, byte-level drift detection</h2><p>Before rewriting anything, I built an equivalence harness. For each of 45 MDX queries:</p><p><code>capture legacy cell-set + sequence of emitted SQL &#8594; golden/</code></p><p><code>run under Calcite backend &#8594; assert cell-set matches golden</code></p><p>Three failure classes:</p><ul><li><p><strong>LEGACY_DRIFT</strong> &#8212; cell values differ from golden. Hard gate. The whole point.</p></li><li><p><strong>SQL_DRIFT</strong> &#8212; cell values match but emitted SQL differs. Advisory. Expected: Calcite writes ANSI <code>INNER JOIN</code>, legacy writes comma-separated <code>FROM</code> + join conditions in <code>WHERE</code>. Postgres plans both identically.</p></li><li><p><strong>PLAN_DRIFT</strong> &#8212; Calcite&#8217;s <code>RelNode</code> differs from a frozen snapshot. Advisory. Review signal for refactors.</p></li></ul><p>The harness stored row-level SHA-256 checksums per JDBC execution. This caught a bug later that no cell-value comparison would have: <strong>identical values returned in different row order also fail checksum</strong>. We&#8217;ll come back to that.</p><h2>First-cut benchmarks: Calcite was 27% slower. Wait, what?</h2><p>After worktree 1 landed, I ran the 2&#215;2 matrix:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;9ec11783-b8fa-4036-91aa-bef4d4bf07c7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">|                     | HSQLDB (87k rows)     | Postgres (86.8M rows) |
|---------------------|----------------------:|----------------------:|
| Legacy Mondrian SQL | 1.13s / query geomean | 7.91s / query geomean |
| Calcite SQL         | 0.24s / query geomean | 9.44s / query geomean |</code></pre></div><p><strong>Calcite was 27% slower on Postgres.</strong> At 1000&#215; scale. With all 44 cell-set assertions passing.</p><p>The MvHit queries (small agg-table scans) were the worst &#8212; 2.24&#215; slower than legacy. That&#8217;s a 2.6k-row agg table taking 5 seconds. Something was eating a second and a half of wall time before any SQL even hit the wire.</p><h2>Y.2: one cache key</h2><p>I instrumented every phase of the Calcite path. Cold iteration on an MvHit query:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;50f60021-5343-4cfc-8e88-ed41416a7a9b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">plan.total            675 ms

  relBuilderCreate    478 ms    &#8592; here
  build                34 ms
  unparse               6 ms</code></pre></div><p><code>RelBuilder.create</code> doesn&#8217;t take 478 ms. What&#8217;s happening is that the <code>CalciteMondrianSchema</code> adapter wraps <code>JdbcSchema.create(rootSchema, name, ds, null, null)</code>, and Calcite&#8217;s JDBC adapter <strong>reflects the entire database&#8217;s metadata</strong> (every table, every column, every type) via <code>DatabaseMetaData</code> on first touch. On HSQLDB that&#8217;s in-process and instant. On Postgres, that&#8217;s a round-trip to <code>pg_catalog</code> for every table in the schema.</p><p>Fine, so you cache it. The cache was there:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;8181fecf-7a1e-49da-8f4d-ebf3c7028c22&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">// SegmentLoader.java

private static final Map&lt;RolapStar, CalciteSqlPlanner&gt; CACHE = ...;</code></pre></div><p>Keyed on <code>RolapStar</code>. And Mondrian&#8217;s per-query schema-cache flush invalidates the <code>RolapStar</code>. Every query &#8212; a fresh star, cache miss, full JDBC metadata reflection.</p><p>The fix was three lines:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;add4ba82-768b-483a-a9ae-bfec9879e638&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">// CalcitePlannerCache.java

public static CalciteSqlPlanner plannerFor(DataSource ds) {
    Key key = Key.from(ds);  // (url, catalog, schema, user) via DatabaseMetaData
    return CACHE.computeIfAbsent(key, k -&gt; build(ds));
}</code></pre></div><p>Key on JDBC connection identity. The cache persists across Mondrian&#8217;s schema churn because nothing Mondrian does invalidates the JDBC connection&#8217;s identity.</p><p>Result:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;f000d52c-efec-414c-b7ca-c21ac0b24a86&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">|                          | HSQLDB| Postgres                      |
|--------------------------|------:|------------------------------:|
| Pre-fix D/B geomean      | &#8212;     | 1.27&#215; (Calcite 27% slower)    |
| **Post-fix D/B geomean** | &#8212;     | **0.93&#215; (Calcite 7% faster)** |
| MvHit D/B                | 1.07&#215; | **1.01&#215;** (was 2.24&#215;)         |
</code></pre></div><p><strong>One cache key moved the needle 34 percentage points.</strong> This was the single biggest perf fix in the entire project. Everything downstream is stacked on top of this.</p><p>Lesson: when a slowdown is roughly the same absolute size on every query, and so hurts the smallest queries most, it&#8217;s almost always a fixed per-query cost. Not a plan quality issue. Find the fixed cost first.</p><h2>The 2&#215;2 matrix: Postgres plans both SQLs identically</h2><p>After Y.2, I re-ran the full matrix:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;6255f61e-edab-47e9-b4c6-70c24b939215&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">|                                                 | HSQLDB (87k)                      | Postgres (86.8M)    |
|-------------------------------------------------|----------------------------------:|--------------------:|
| **C/A geomean (Calcite vs legacy on HSQLDB)**   | **0.219&#215;** (Calcite 4.57&#215; faster) | &#8212;                   |
| **D/B geomean (Calcite vs legacy on Postgres)** | &#8212;                                 | **0.958&#215;** (parity) |</code></pre></div><h3>HSQLDB: 4.57&#215; speedup is real but not for the obvious reason</h3><p>Why does Calcite win 4.57&#215; on HSQLDB? Not because Calcite emits &#8220;better&#8221; SQL in any meaningful sense &#8212; Postgres proves that.</p><p>HSQLDB&#8217;s query planner is primitive. Calcite&#8217;s SQL is easier on it in four ways:</p><ul><li><p>Explicit ANSI INNER JOINs (HSQLDB handles these better than comma joins).</p></li><li><p>Consistent keyword case (HSQLDB&#8217;s parser has fewer decisions to make).</p></li><li><p>No redundant <code>"customer" AS "customer"</code> table aliases (HSQLDB allocates fewer scopes).</p></li><li><p>OR chains rewritten to real <code>IN (v1, v2, v3)</code> lists.</p></li></ul><p>Postgres doesn&#8217;t care. Its planner normalizes all of this internally before execution. HSQLDB doesn&#8217;t.</p><p>If your workload is Mondrian + HSQLDB (more common than you&#8217;d think &#8212; it&#8217;s the default for Mondrian demos, plus there are production deployments), Calcite is a meaningful perf improvement <strong>even without any other change</strong>.</p><h3>Postgres: parity, with interesting exceptions</h3><p>Most of the corpus lands at D/B &#8776; 1.0&#215; (within 5% of legacy). The per-query overhead of Calcite&#8217;s translation (build <code>PlannerRequest</code>, construct <code>RelBuilder</code>, unparse) is a few milliseconds on top of a 10s Postgres query &#8212; noise.</p><p>But a handful of queries show 5-12% Calcite wins:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;7732e425-9ea3-4435-a8d3-b534c50bd4ed&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">| Query                           | Legacy (s) | Calcite (s) | D/B        |
|---------------------------------|-----------:|------------:|-----------:|
| `named-set`                     | 29.9       | 26.2        | **0.88&#215;**  |
| `topcount`                      | 30.5       | 26.7        | **0.88&#215;**  |
| `native-topcount-product-names` | 32.5       | 29.0        | **0.89&#215;**  |
| `filter`                        | 19.9       | 18.4        | **0.93&#215;**  |
| `native-filter-product-names`   | 19.9       | 18.5        | **0.93&#215;**  |</code></pre></div><p>What these have in common: they&#8217;re among the heaviest queries. Calcite pushes more predicates into join conditions, giving Postgres more scope to optimize. At 30s per query, 12% is 3-4 seconds saved per execution.</p><p>Interesting outlier: <code>basic-select</code><strong> on Postgres is 0.26&#215; (roughly 4&#215; faster) under Calcite</strong>. This was a bit of a surprise. EXPLAIN ANALYZE shows Postgres picks a slightly different plan &#8212; the Calcite SQL doesn&#8217;t include the unused <code>time_by_day</code> table Mondrian&#8217;s legacy path drags in for cardinality probing. Less scanning, less aggregation work.</p><h3>Where it gets fun: the MV matcher</h3><p>Mondrian supports aggregate tables. You declare them in schema XML:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;xml&quot;,&quot;nodeId&quot;:&quot;51ac9f0d-3cd0-42e6-9345-f8b8a7e3774c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-xml">&lt;MeasureGroup table='agg_c_14_sales_fact_1997' type='aggregate'&gt;
  &lt;Measures&gt;
    &lt;MeasureRef name='Unit Sales' aggColumn='unit_sales'/&gt;
    &lt;!-- ... --&gt;
  &lt;/Measures&gt;
  &lt;DimensionLinks&gt;
    &lt;ForeignKeyLink dimension='Store' foreignKeyColumn='store_id'/&gt;
    &lt;CopyLink dimension='Time' attribute='Month'&gt;
      &lt;Column aggColumn='the_year' table='time_by_day' name='the_year'/&gt;
    &lt;/CopyLink&gt;
    &lt;!-- ... --&gt;
  &lt;/DimensionLinks&gt;
&lt;/MeasureGroup&gt;</code></pre></div><p>Mondrian&#8217;s own logic picks one of these when the grain matches a query. But <strong>only if the operator sets </strong><code>mondrian.rolap.UseAggregates=true</code><strong> and </strong><code>mondrian.rolap.ReadAggregates=true</code>. Global flags. Off by default. Most deployments I&#8217;ve seen have them off because they weren&#8217;t documented prominently, and leaving them off is safe.</p><p>So: queries against your big fact table scan the big fact table, even though you declared a handy 2.6k-row pre-aggregate that perfectly answers the query.</p><p>Calcite&#8217;s answer: a <code>MaterializedViewRule</code> in its planner that does cost-based MV selection. Register the declared aggregates as <code>RelOptMaterialization</code> entries, and the rule rewrites any query a materialization can answer so that it scans the materialization instead. No operator flag needed.</p><p>I built the infrastructure: <code>MvRegistry</code> walks the schema&#8217;s aggregate MeasureGroups and builds a <code>RelOptMaterialization</code> per declared agg. Registered with <code>VolcanoPlanner</code>. PK/FK metadata surfaced so Calcite&#8217;s Goldstein-Larson duplication-preservation check could fire.</p><p>And the rule didn&#8217;t fire. At all. Ever.</p><h3>The wall I hit</h3><p>Calcite&#8217;s MV rule family uses <code>SubstitutionVisitor</code> internally. It compares the user query&#8217;s <code>RelNode</code> tree against the MV&#8217;s defining-query <code>RelNode</code> tree and returns substitutions when one subsumes the other. 
Structural match is strict:</p><ul><li><p>User query&#8217;s <code>RelNode</code> shape: <code>Aggregate(Scan(fact) Join Scan(dim))</code> &#8212; 1 join.</p></li><li><p>MV&#8217;s defining query: <code>Aggregate(Scan(fact) Join Scan(dim1) Join Scan(dim2) Join Scan(dim3) Join Scan(dim4) Join Scan(dim5))</code> &#8212; 5 joins.</p></li></ul><p>SubstitutionVisitor has no way to prove the MV&#8217;s 4 extra joins are &#8220;safe to drop&#8221; without functional-dependency (FD) and uniqueness metadata. Even with PK/FK metadata surfaced, rule coverage was incomplete for the denormalized-CopyLink shapes FoodMart&#8217;s aggregates use.</p><p>After 6 hours of iteration, the rule was rewriting zero queries.</p><h3>Option D: stop fighting, write the matcher</h3><p>Calcite&#8217;s MV framework is general-purpose. Our use case is narrow: given a Mondrian <code>&lt;MeasureGroup type='aggregate'&gt;</code> declaration, match user queries whose shape matches the declared agg.</p><p>I wrote a 280-line <code>MvMatcher</code> that walks each registered <code>ShapeSpec</code> and matches user <code>PlannerRequest</code>s directly:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;5c99bd5b-fd65-47ea-8973-f41da9c20e05&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">if MV&#8217;s group-by columns &#8839; user&#8217;s group-by columns
   AND user&#8217;s measure columns are pre-aggregated on the MV
   AND user&#8217;s filters reference columns the MV carries
   &#8594; rewrite factTable to aggTable, drop the joins the agg no longer needs, translate columns</code></pre></div><p>Deterministic. Bypasses <code>SubstitutionVisitor</code> entirely. No trait/convention wrestling.</p><p>Unit tests proved all 4 MvHit corpus queries rewrite correctly to agg-table SQL.</p><p>Then I ran the Postgres perf benchmark. <strong>Zero effect.</strong> The matcher wasn&#8217;t firing at runtime.</p><h3>The second wall</h3><p>The planner cache (from Y.2) was attached, but with no <code>MvRegistry</code> because the first caller to <code>plannerFor(DataSource)</code> didn&#8217;t pass the schema. The cache stored a registry-less planner. Subsequent calls with schemas hit the cache and returned the registry-less planner.</p><p>Fix: extended the signature to <code>plannerFor(DataSource, RolapSchema)</code> and made both the <code>SegmentLoader</code> and <code>SqlTupleReader</code> dispatch seams pass the schema. Late-bind the registry if a schema arrives on a cache hit.</p><p>Re-ran the benchmark. <strong>Still zero effect.</strong></p><h3>The third wall</h3><p>Added a trace to <code>CalcitePlannerCache</code>: the registry it built had <code>size() == 0</code>. Why?</p><p><code>MvRegistry.fromSchema</code> walks <code>rolapSchema.getCubeList()</code>. On the first call, before Mondrian finishes cube initialization, that list is empty. My late-bind code checked &#8220;is registry null?&#8221; &#8212; and a size-0 registry isn&#8217;t null.</p><p>Fix: retry the registry build when <code>size() == 0</code> AND the caller has a schema. Once the size is greater than 0, stop retrying.</p><p>Re-ran the benchmark. <strong>Now 2 of 4 MvHit queries rewrite.</strong> 10&#215; speedup on those. But the other two didn&#8217;t match.</p><h3>The fourth wall</h3><p>The other two MvHit queries use <code>agg_g_ms_pcat_sales_fact_1997</code>, which has <code>gender</code> and <code>marital_status</code> CopyLinked from <code>customer</code> (not reached via FK join). 
My shape catalog&#8217;s <code>family-gender</code> shape declared 2 group-by columns: <code>product_family</code>, <code>gender</code>. But runtime MDX always slices by <code>[Time].[Year]</code> under Mondrian&#8217;s <code>hasAll='false'</code> setting, so the runtime <code>groupBy</code> is actually <code>[the_year, product_family, gender]</code> &#8212; three columns.</p><p>Fix: added <code>year-family-gender</code> and <code>year-family-gender-marital</code> shape variants.</p><p>Re-ran. <strong>All 4 match.</strong> But <code>crossjoin</code> and <code>calc-iif-numeric</code> started drifting.</p><h3>The fifth wall (self-inflicted)</h3><p>Adding shapes is easy. Over-matching is just as easy. My new <code>year-family-gender-marital</code> shape (4 group columns) also matched <code>crossjoin</code>&#8217;s user query <code>[the_year, gender, marital_status]</code> (3 columns) because the matcher accepted &#8220;user is a subset&#8221; &#8212; a semantically valid SUM-over-SUM rollup from a finer-grained MV.</p><p>That rewrite is mathematically correct. But the agg&#8217;s pre-aggregated rows are at <code>(year, family, gender, marital_status, quarter, month)</code> grain. Running <code>SELECT year, gender, marital_status, SUM(unit_sales) FROM agg GROUP BY year, gender, marital_status</code> gives the right totals &#8212; and a completely different row iteration order than the base-fact scan. Different row order, different checksum, <code>LEGACY_DRIFT</code>.</p><p>Fix: require EXACT group-by-set-size equality between user request and shape. If you want a coarser-grained match, declare a coarser-grained shape. 
Explicit beats inferred.</p><p>Re-ran. <strong>44/44 harness green, 4/4 MvHit queries rewrite, 10&#215; speedup.</strong></p><h3>Final MvHit numbers (Postgres, UseAggregates=false)</h3><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;ff789164-9657-4880-aed4-e33d4171c359&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">| Query                                  | Legacy (fact scan, 86M rows)   | Calcite (matcher agg scan, ~2.6k rows) | Speedup   |
|----------------------------------------|-------------------------------:|---------------------------------------:|----------:|
| `agg-g-ms-pcat-family-gender`          | 12.5s                          | 1.3s                                   | **10.0&#215;** |
| `agg-c-year-country`                   | 10.0s                          | 1.2s                                   | **8.6&#215;**  |
| `agg-c-quarter-country`                | 11.7s                          | 1.2s                                   | **9.7&#215;**  |
| `agg-g-ms-pcat-family-gender-marital`  | 13.3s                          | 1.2s                                   | **10.9&#215;** |
| **Geomean**                            |                                |                                        | **9.7&#215;**  |</code></pre></div><p>No operator flag needed. Calcite sees the declared aggregates, matches the query, rewrites. User just asks &#8220;show me sales by year and country&#8221; and gets the answer in 1.2 seconds instead of 12.</p><h2>The benchmarking detour</h2><p>At one point my benchmark showed <code>agg-g-ms-pcat-family-gender-marital</code> was <strong>2.22&#215; slower</strong> under Calcite than legacy. Other queries fine. Wrote a diagnostic, ran EXPLAIN ANALYZE, compared plans, stared at query plans for two hours.</p><p>The reason: I was launching both Mondrian JVMs in parallel against the same Postgres:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;25321d73-201a-434b-88f0-43fe246b54bc&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">mvn ... -Dmondrian.backend=legacy &amp;

mvn ... -Dmondrian.backend=calcite &amp;</code></pre></div><p>Both JVMs hit the same 8-core laptop and the same Postgres instance. Whichever cell&#8217;s heaviest query happened to overlap with the other cell&#8217;s heaviest query got disproportionately slow. Non-deterministic. Looked like a regression on one specific query.</p><p>Sequential runs showed D/B = 1.02&#215; for that query. No regression. Just contention.</p><p>Lesson: <strong>never run head-to-head database benchmarks in parallel against the same database instance. </strong>It&#8217;s obvious in retrospect. I lost 4 hours to it.</p><h2>What I&#8217;d do differently</h2><ul><li><p><strong>Instrument first, optimize second.</strong> The Y.2 fix was obvious once I had phase-by-phase timings. I spent days before that trying to make Calcite emit &#8220;better&#8221; SQL when the real cost was metadata reflection.</p></li><li><p><strong>Don&#8217;t fight general-purpose frameworks.</strong> Calcite&#8217;s <code>MaterializedViewRule</code> is powerful and brittle. For a constrained use case (matching Mondrian&#8217;s declared aggregates), a 280-line hand-rolled matcher beat 6 hours of trying to make the general rule fire.</p></li><li><p><strong>Checksum row order as a proxy for correctness is sharp.</strong> Identical aggregated values in different row order trip LEGACY_DRIFT. Fine for catching real regressions. Also fine for catching your own harmless changes. Know which you&#8217;re seeing.</p></li><li><p><strong>Sequential benchmarks.</strong> Always.</p></li></ul><h2>By the numbers</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;482d8daa-9b38-4dd4-a187-6295872f5822&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">|                                                        | Measurement           |
|--------------------------------------------------------|:---------------------:|
| Production commits on the branch                       | 66                    |
| Lines of new Java (main/)                              | ~3 500                |
| Lines of test Java (test/)                             | ~2 800                |
| Harness MDX queries                                    | 45                    |
| Legacy dialect classes deleted (so far)                | 0                     |
| Legacy dialect classes queued for deletion             | 30+                   |
| **Calcite speedup vs legacy (HSQLDB geomean)**         | **4.57&#215;**             |
| **Calcite speedup vs legacy (Postgres geomean)**       | **1.04&#215;**             |
| **MV matcher speedup (Postgres, UseAggregates=false)** | **9.74&#215;**             |
| Y.2 fix &#8212; lines changed                                | 3                     |
| Y.2 fix &#8212; perf impact                                  | 34 pp (1.27&#215; &#8594; 0.93&#215;) |</code></pre></div><p>The whole rewrite took roughly 5 days of focused work (3 worktrees + all the investigation). Worktree 4 (deletion) is probably another day. Then the blog has a cleaner ending.</p><h2>What this unlocks</h2><p>The real value of this isn&#8217;t 4.57&#215; on HSQLDB. HSQLDB isn&#8217;t exactly a production warehouse. The real value is:</p><ol><li><p><strong>A generic dialect path</strong>. Want to support ClickHouse? DuckDB? Snowflake? Calcite has dialects. Add one line to <code>CalciteDialectMap</code>. Done. No more 300-line <code>MySqlDialect</code> subclasses.</p></li><li><p><strong>Automatic aggregate-table rewriting</strong>. No configuration flag required. Declare your aggregates in schema XML, and queries hit them.</p></li><li><p><strong>A path to real cost-based optimization</strong>. The VolcanoPlanner infrastructure is in place. Someone who wants cost-based MV selection, join reordering, or custom rules has somewhere to put them.</p></li><li><p><strong>Less surface area for bugs</strong>. 30 hand-tuned dialect subclasses are 30 places for NULL-ordering quirks to hide. Calcite has one tested <code>SqlDialect</code> implementation per database.</p></li></ol><p>The Postgres parity is the interesting one to sit with. Calcite isn&#8217;t making Postgres queries slower. It isn&#8217;t making them meaningfully faster on plan quality either. But it IS making the code a lot smaller, cleaner, and easier to extend. 
Those are wins you can&#8217;t benchmark.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Not All AI is Created Equal: Three Questions to Cut Through the Noise]]></title><description><![CDATA[&#8220;Where do you want to go today?&#8221;]]></description><link>https://articles.concepttocloud.com/p/not-all-ai-is-created-equal-three</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/not-all-ai-is-created-equal-three</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Mon, 16 Feb 2026 12:13:10 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Microsoft&#8217;s campaign slogan from the mid 90s. And does anyone remember the MPC badge? The Multimedia PC certification that got slapped on &#8220;approved&#8221; machines? The mid 90s were the extremes of the multimedia wars. PCs competing to become the best device for watching videos, playing MP3s, consuming all the data you could handle, assuming it was downloadable on a 4kbps dial-up connection in between the 30-minute disconnections.</p><p>Everything got the multimedia label. If it had speakers and a CD-ROM drive, it was a Multimedia PC. The badge told you almost nothing about what the machine could actually do, but it sold units.</p><p>Fast forward 30 years (and yes, I feel old too) and we&#8217;re living through exactly the same thing with AI. If it plugs into a wall, it&#8217;ll have some AI strapline somewhere, it seems. Your PC, your phone, your watch, your fridge, your microwave, your oven. 
Everything is &#8220;AI-ready&#8221; or &#8220;AI-powered.&#8221; The label is everywhere, and it tells you almost nothing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="7008" height="4672" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4672,&quot;width&quot;:7008,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Close-up of computer circuit board with many chips&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Close-up of computer circuit board with many chips" title="Close-up of computer circuit board with many chips" srcset="https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1771189958069-a6b00817825c?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxODB8fHN0YXRpc3RpY3N8ZW58MHx8fHwxNzcxMjQzOTQ0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div 
class="pencraft pc-display-flex pc-gap-8 pc-reset"></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@jakubpabis">Jakub Pabis</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p></p><h2>The Problem With a Three-Letter Word</h2><p>The reason &#8220;AI&#8221; has become such a useless label isn&#8217;t that people are lying. It&#8217;s that the term covers such an absurdly wide range of technology that it&#8217;s effectively meaningless without qualification. It&#8217;s like saying a vehicle has an engine. Okay, is it a lawnmower or a 747?</p><p>The mainstream conversation right now is dominated by Large Language Models. 
The chatbots, the text generators, the things most people picture when they hear &#8220;AI&#8221; in 2025. But the tech world seems to have collectively forgotten that AI has been around for a lot longer than the past few years. The foundations were laid by Alan Turing and others in the 1940s and 50s. The first artificial neural network, SNARC, appeared in 1951. The first program that could learn from its own mistakes, a checkers-playing system, arrived in 1952.</p><p>My own career has taken me across the gamut of computer science and ML projects, and while I&#8217;m no academic researcher, I&#8217;ve worked alongside some very clever people building very different types of models to solve very different problems. Statistical models for fraud detection. Computer vision systems for scientific instruments. Recommendation engines. Sensor monitoring. All of it AI, and none of it an LLM.</p><p>So when everything gets called &#8220;AI-ready,&#8221; the question that should follow is: <em>what kind?</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, 
https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="5456" height="3064" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3064,&quot;width&quot;:5456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;white and blue device&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="white and blue device" title="white and blue device" srcset="https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, 
https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1583916833408-c5a6287c67de?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx3aW5kb3dzJTIwOTV8ZW58MHx8fHwxNzcxMjQzNjk5fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@necit15">Nikita Zaitsev</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p></p><h2>Three Questions That Actually Matter</h2><p>Instead of accepting the AI label at face value, I&#8217;d suggest asking three questions. They won&#8217;t make you an expert, but they&#8217;ll cut through the marketing fog faster than anything else.</p><h3>1. Is it learning, or is it following rules?</h3><p>This is the most basic distinction, and it&#8217;s the one most often obscured by marketing. A genuine machine learning model improves its performance based on data. It identifies patterns, adjusts, and gets better over time. A rules engine executes a predefined set of if/else logic that a human wrote.</p><p>Both are perfectly valid tools. Sometimes a well-crafted rules engine is exactly what you need. It&#8217;s fast, predictable, and easy to audit. There is absolutely nothing wrong with that. But calling it &#8220;AI-powered&#8221; sets an expectation that the system is doing something it isn&#8217;t. If your vendor can&#8217;t clearly explain whether their product is learning from data or following a script, that&#8217;s a red flag, not because rules engines are bad, but because the distinction matters for how you evaluate, trust, and maintain the thing.</p><h3>2. Is the output deterministic or probabilistic?</h3><p>This is the question that I think matters most, and it&#8217;s the one that gets lost almost entirely in the current conversation.</p><p>A deterministic model, given the same input, will give you the same output every time. You can test it, validate it, explain it, and predict its behaviour. A probabilistic model gives you its best guess, a plausible output that might differ next time you ask the exact same question. 
If you&#8217;ve ever asked ChatGPT the same thing twice and got different answers, you&#8217;ve experienced this firsthand.</p><p>Neither approach is inherently better. They serve different purposes. But the implications for trust, governance, and risk are completely different.</p><p>Consider financial transaction monitoring. If you&#8217;re flagging potentially suspicious activity, do you want a model that gives you the same risk score every time for the same transaction? Or one that might score it differently on a different day? Both approaches exist, both have their place. But you&#8217;d better know which one you&#8217;re deploying and why, because the regulatory, audit, and compliance implications are worlds apart.</p><p>The same applies in healthcare, manufacturing, legal tech, anywhere the stakes are real. A deterministic model that classifies a tumour the same way every time is a fundamentally different tool from a probabilistic one that gives you its best interpretation. Both might be called AI. The governance they require couldn&#8217;t be more different.</p><p>This is also where a lot of the &#8220;boring&#8221; AI lives. The workhorses that don&#8217;t make headlines but run critical infrastructure. Fraud scoring on your credit card. Pricing algorithms when you book a flight. Predictive maintenance on industrial equipment. These models are often deterministic, explainable, and battle-tested over years of data. Nobody writes breathless articles about them, but they&#8217;re doing higher-stakes work than most LLM deployments. Nobody dies if a chatbot writes a bad email. A misclassified fraud signal or a missed anomaly in a sensor reading is a different story entirely.</p><h3>3. 
Is it making decisions, or informing them?</h3><p>The final question is about autonomy, and it changes everything about risk, liability, and trust.</p><p>A model that flags a suspicious transaction for a human analyst to review is a fundamentally different proposition from one that automatically blocks the transaction. A system that highlights a potential defect on a production line for an inspector to check is not the same as one that rejects the part without human involvement. A tool that drafts an email for you to edit is not the same as one that sends it.</p><p>The governance, liability, and customer experience implications shift dramatically based on where the human sits in the loop. An AI system that surfaces information and lets a person decide carries one kind of risk. An autonomous system that acts carries another entirely. Both might be labelled &#8220;AI-powered.&#8221; The label doesn&#8217;t tell you which one you&#8217;re buying.</p><h2>The Label Will Eventually Become Meaningless, and That&#8217;s Fine</h2><p>The multimedia wars resolved themselves. We stopped caring about the MPC badge and started caring about what the machine could actually do. The same will happen with AI. Eventually, the marketing label will fade, and people will just talk about what the technology does, how it works, and whether it&#8217;s the right fit for the problem.</p><p>But right now, we&#8217;re in the messy middle. The label is everywhere and means almost nothing. Every product is &#8220;AI-ready.&#8221; Every vendor has an AI story. The companies and leaders that will get the most value from this era aren&#8217;t the ones buying everything with an AI sticker on it. They&#8217;re the ones who understand what kind of AI solves their specific problem and can tell the difference between a lawnmower and a 747.</p><p>So the next time someone pitches you an AI-powered product, skip the demos and the buzzwords. Just ask three questions: Is it learning or following rules? 
Is the output deterministic or probabilistic? Is it making decisions or informing them?</p><p>If they can answer clearly, you&#8217;re probably talking to someone who knows what they&#8217;ve built. If they can&#8217;t, well, that tells you something too.</p><p><em>Where do you want to go today?</em> Turns out, the real question was always: what&#8217;s actually under the hood?</p>]]></content:encoded></item><item><title><![CDATA[Design Partners in Engineering Decisions: When and How to Include Them]]></title><description><![CDATA[When you&#8217;re building software to analyse data from a rover 200 million kilometres away, you don&#8217;t get many chances to iterate.]]></description><link>https://articles.concepttocloud.com/p/design-partners-in-engineering-decisions</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/design-partners-in-engineering-decisions</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Mon, 05 Jan 2026 11:03:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!s9Du!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When you&#8217;re building software to analyse data from a rover 200 million kilometres away, you don&#8217;t get many chances to iterate. The feedback loop is measured in light-minutes, not sprint cycles. But the real lesson I learned working on PIXL at NASA&#8217;s Jet Propulsion Laboratory wasn&#8217;t about the constraints of interplanetary communication. It was about what happens when you treat design as foundational rather than decorative.</p><p>The scientists using Pixlise, the software we built to analyse X-ray spectrometry data from Mars, went from taking roughly a year to publish papers to doing it in weeks. That&#8217;s not because we wrote better code. 
It&#8217;s because we understood how they actually worked before we wrote any code at all.</p><p>Most engineering leaders I talk to don&#8217;t dispute that design matters. They just think it can wait. Ship first, polish later. Get something in front of users and iterate. The intention is good. Stay lean, move fast, avoid over-engineering. But there&#8217;s a hidden assumption in that approach: that design is about polish. That it&#8217;s the paint you apply after the structure is built.</p><p>It isn&#8217;t. And that misunderstanding costs more than most teams realise.</p><div><hr></div><h1>The False Economy of &#8220;We&#8217;ll Figure Out the UX Later&#8221;</h1><p>The pressure to ship is real. I run a consultancy and I see it constantly. Stakeholders who want a working prototype by next month. Investors who want to see progress. Clients who&#8217;ve already announced launch dates. Stopping to do user research feels like a luxury when the sprint board is overflowing.</p><p>So what happens? Engineers become designers by default. We sketch out interfaces based on how we think the data should flow. We make assumptions about what users want because asking them would take too long. We ship something that works, in the technical sense, and wait for feedback.</p><p>The feedback arrives. It&#8217;s not what we expected.</p><p><a href="https://www.bcg.com/publications/2024/software-projects-dont-have-to-be-late-costly-and-irrelevant">BCG research</a> shows that nearly half of C-suite executives report that more than 30% of their tech projects run over budget and late. The reasons they cite include misalignment between technical and business teams, unrealistic timelines, and insufficient resources. But underneath those symptoms there&#8217;s a quieter problem. Teams building software that solves the wrong problem. 
Or solves the right problem in a way that doesn&#8217;t fit how users actually work.</p><p><a href="https://en.wikipedia.org/wiki/Barry_Boehm">Barry Boehm</a> calculated in the 1980s that fixing a defect in production costs roughly 100 times more than catching it during design. A study at Ricoh put it more starkly: the cost of fixing a design defect was $35 during design and $690,000 in field service. These numbers vary by context but the pattern is consistent. The later you discover a mismatch between what you built and what users need, the more expensive it becomes to fix.</p><p>And yet we keep placing the same bet. That we&#8217;ll be able to course-correct after launch. Sometimes we&#8217;re right. Often we&#8217;re not. And the lurking usability problems become emergencies that consume the engineering team&#8217;s roadmap for months.</p><div><hr></div><h1>What &#8220;Design&#8221; Actually Means Here</h1><p>When I say design should be included early, I&#8217;m not talking about colour palettes or button placement. I&#8217;m talking about a specific kind of work that engineers often don&#8217;t have time or training to do well.</p><ol><li><p>User research is the practice of understanding who will actually use this software, what they&#8217;re trying to accomplish, and how they currently get that work done. It&#8217;s not about asking people what features they want. It&#8217;s about observing and interviewing to understand the shape of the problem.</p></li><li><p>Interaction design is figuring out how users will accomplish their goals through the software. What&#8217;s the flow? What happens when things go wrong? Where are the decision points and how do we support them?</p></li><li><p>Design systems are shared vocabularies. Components, patterns, and conventions that both designers and engineers can reference. They reduce ambiguity and make it easier to build consistently.</p></li><li><p>Service design zooms out further. 
It&#8217;s about the processes people use, not just the screens they interact with. How does this software fit into their broader workflow?</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s9Du!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s9Du!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg 424w, https://substackcdn.com/image/fetch/$s_!s9Du!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg 848w, https://substackcdn.com/image/fetch/$s_!s9Du!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!s9Du!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s9Du!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg" width="1456" height="724" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:724,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5690410,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://articles.concepttocloud.com/i/183079862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s9Du!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg 424w, https://substackcdn.com/image/fetch/$s_!s9Du!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg 848w, https://substackcdn.com/image/fetch/$s_!s9Du!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!s9Du!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F942b978f-9e19-431a-91e1-1608f684bc2f_7200x3580.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"></div></div></div></a></figure></div><p></p><p>At JPL, the Human Centered Design group takes this seriously. As one of their senior designers put it, &#8220;By designing our processes around the needs of the people who use them, we not only save the taxpayers money by making people more efficient, we maximise the science we can get out of each mission.&#8221; That&#8217;s the frame. 
Design as leverage for outcomes, not decoration.</p><div><hr></div><h1>What This Looks Like in Practice: The PIXL Story</h1><p>PIXL, the Planetary Instrument for X-ray Lithochemistry, is an X-ray spectrometer mounted on the Perseverance rover. It analyses the chemical composition of Martian rocks at a level of detail that helps scientists understand the planet&#8217;s geological history and search for signs of ancient microbial life.</p><p>The instrument generates enormous amounts of data. Before Pixlise, the scientists analysing that data relied on Excel-based pipelines they&#8217;d built over years. Those pipelines worked. Scientists are resourceful. But they didn&#8217;t enable collaboration. Each researcher had their own spreadsheets, their own methods, their own version of the truth.</p><p>We built Pixlise to change that. It&#8217;s an open-source web application that allows scientists and researchers to analyse PIXL data in near real-time, collaboratively, while retaining the flexibility of their existing workflows. 
They can share insights, build on each other&#8217;s work, and move from raw data to published findings faster than ever before.</p><p>But here&#8217;s the thing. That outcome wasn&#8217;t inevitable. It happened because we had a dedicated UX research function embedded in the project from the start.</p><p>The researchers didn&#8217;t just sketch interfaces. They aggressively interviewed different members of the user base. Scientists, researchers, anyone with an interest in analysing rover data. They watched how people worked. They asked questions that engineers wouldn&#8217;t think to ask, because engineers are focused on what&#8217;s technically possible, not on what matches how scientists actually think.</p><p>What emerged from that research shaped the product in ways that wouldn&#8217;t have happened otherwise.</p><p>The charts had to make sense to users, not just to engineers. Technical accuracy wasn&#8217;t enough. The visualisations needed to match the mental models that planetary scientists already had. The way they think about spectrometry data, the comparisons they instinctively want to make.</p><p>The interaction between visual data and analysis tools had to feel natural. Scientists are looking at images from Mars alongside spectral data. The way those two things connect in the interface matters enormously. Get it wrong and you&#8217;re fighting the tool instead of using it.</p><p>The scripting and function-writing capabilities had to extend existing workflows, not replace them. Scientists weren&#8217;t going to abandon their Excel pipelines overnight. We needed to augment what they already did, not demand they start from scratch.</p><p>None of this would have been obvious from the requirements document. It came from research.</p><p>The result? Scientists felt ownership of the product. They&#8217;d been consulted, heard, and involved. They weren&#8217;t being handed something built for them. 
They were using something built with them.</p><p>And the measurable outcome speaks for itself. The time from data collection to published paper dropped from roughly a year to weeks. That&#8217;s not a marginal improvement. That&#8217;s a transformation in how science gets done.</p><div><hr></div><h1>Technical Feasibility Workshops: Where Design Meets Engineering</h1><p>Early design involvement doesn&#8217;t mean designers go off and create something in isolation, then hand it to engineering. That&#8217;s just moving the handoff earlier. The real value comes when design and engineering work together from the start.</p><p>One practical way to do this is the technical feasibility workshop. A structured session early in a project where both disciplines surface constraints together.</p><p><strong>When to run one:</strong> Before architecture decisions are locked. During discovery or definition phases, when you&#8217;re still figuring out what you&#8217;re building.</p><p><strong>Who&#8217;s in the room:</strong> At minimum, an engineering lead and someone responsible for design or user research. Ideally also a product owner and, if at all possible, an actual user or customer.</p><p><strong>What you cover:</strong></p><ol><li><p><em>User story walkthrough (designer-led).</em> What is the user trying to accomplish? What does success look like for them? Where are the pain points today?</p></li><li><p><em>Technical constraint mapping (engineer-led).</em> What are the hard constraints? What&#8217;s expensive or risky? What technical debt already exists that will shape what&#8217;s possible?</p></li><li><p><em>Overlap identification (collaborative).</em> Where do user needs and technical constraints intersect? What&#8217;s both desirable and feasible? What trade-offs are we already facing?</p></li><li><p><em>Risk surfacing (collaborative).</em> What happens if we get this wrong? 
Where are the assumptions we&#8217;re most uncertain about?</p></li><li><p><em>Decision points.</em> What do we need to prototype or test to reduce uncertainty?</p></li></ol><p>This isn&#8217;t a one-time event. Spotify describes their approach as shifting weights. Design and engineering don&#8217;t hand off to each other. They shift who&#8217;s leading at different phases while remaining in the room together throughout. The designer&#8217;s job isn&#8217;t just to create designs. It&#8217;s to facilitate getting to the best design outcome, which requires engineering input from the beginning.</p><p><a href="https://www.getyourguide.careers/posts/building-better-products-through-strong-design-and-engineering-collaboration">GetYourGuide</a> calls it &#8220;handshake instead of handoff.&#8221; The difference is subtle but significant. A handoff implies sequential work. Design finishes, then engineering starts. A handshake implies parallel collaboration. We&#8217;re working on this together, even when one of us is leading.</p><div><hr></div><h1>Collaborative Prototyping: Keeping the Feedback Loop Tight</h1><p>The old model for design-engineering collaboration looked like a relay race. Designers did their work, documented it exhaustively, and threw it over the wall. Engineers picked it up, discovered a dozen reasons why it wouldn&#8217;t work as specified, and either built something different or sent it back for revision.</p><p>The new model keeps both disciplines in the room together.</p><p>Engineers in user research sessions. Not to solve problems, that&#8217;s not the point, but to watch and listen. When engineers see real users struggling with a specific interaction, it changes how they think about technical trade-offs. &#8220;We should refactor this flow&#8221; becomes a hard sell. 
&#8220;In eight out of ten sessions, users failed to complete this task without backtracking, here&#8217;s the video&#8221; is a different conversation.</p><p>Designers in technical spike reviews. When engineering is exploring whether something is feasible, designers should be there to understand what&#8217;s possible and what&#8217;s expensive. The best design solutions often emerge from understanding constraints, not ignoring them.</p><p>Shared prototypes that evolve together. Rather than a finished design spec that gets implemented, there&#8217;s a prototype that both sides iterate on. Designers adjust based on technical feedback. Engineers adjust based on user feedback. The artifact gets better faster.</p><p>Design systems as living contracts. A good design system isn&#8217;t just a library of components. It&#8217;s a shared agreement between design and engineering about how things work. When both sides contribute to it and reference it, there&#8217;s less ambiguity and less rework.</p><p>The contrast with projects that lack this collaboration is stark. I&#8217;ve worked on projects where there was no formal design process, where we just talked through requirements and started building. Every single time, the result was the same. Larger iterative cycles. Unclear expectations. And eventually a tense conversation where the customer&#8217;s vision turned out to be fundamentally different from what we&#8217;d built.</p><p>These days I refuse to work on projects that don&#8217;t have some element of design thinking baked in. Not because I&#8217;m precious about process. Because I&#8217;ve learned what it costs when you skip it.</p><div><hr></div><h1>&#8220;Engineers Can Figure It Out&#8221; (and Other Expensive Beliefs)</h1><p>When I talk to skeptical engineering leaders about involving design earlier, I hear a few recurring objections.</p><p><strong>&#8220;Engineers can figure it out these days.&#8221;</strong> There&#8217;s some truth here. 
Engineering skill sets have expanded. Many developers have better design intuition than their predecessors. Tools have improved. But there&#8217;s a difference between being able to make reasonable interface decisions and being skilled at understanding user needs. They&#8217;re different disciplines. An engineer can figure out a plausible solution. A designer with research can figure out the right one.</p><p><strong>&#8220;AI can build it, design just slows us down.&#8221;</strong> This is increasingly common and it gets the causality backwards. AI removes coding friction. It makes implementation faster. But it doesn&#8217;t remove the need to know what to build. In fact, faster implementation makes upfront clarity more valuable, not less. If you can spin up a feature in hours instead of weeks, the bottleneck shifts entirely to knowing which feature to build. Good UX research gives developers a clear understanding of what needs to be built and how interactions should work. AI can still be leveraged for implementation. But research removes the computer-based assumptions from the development cycle.</p><p><strong>&#8220;We don&#8217;t have time or budget for design.&#8221;</strong> This one I take personally because I&#8217;ve lived the alternative.</p><p>Early in my consultancy work I took on a project without any structured design process. We discussed requirements regularly. Had good rapport with the client. Thought we understood what they wanted. We were wrong. The gap between their expectations for the user interface and data flow and what we built only became clear late in the project. I made a financial loss on that engagement. Not because the code was bad, but because we built confidently in the wrong direction.</p><p>That experience taught me two things. First, verbal discussions aren&#8217;t enough. You need structured workshops, visual artifacts, and iterative feedback so the customer sees what&#8217;s being built and can flag misalignment early. 
Second, the &#8220;cost&#8221; of design isn&#8217;t additive. It&#8217;s preventive. You&#8217;re not spending extra. You&#8217;re avoiding waste.</p><p>The minimum viable design investment isn&#8217;t a full UX team and months of research. It&#8217;s someone asking &#8220;who is this for and how will they use it?&#8221; before any code gets written. It&#8217;s regular feedback sessions with real artifacts, not just status updates. It&#8217;s treating user understanding as an engineering requirement, not a nice-to-have.</p><div><hr></div><h1>Design as an Engineering Practice</h1><p>The lesson from PIXL wasn&#8217;t that NASA has resources most teams don&#8217;t. It&#8217;s that treating design as foundational, as a structural part of how you build rather than a finishing touch, changes outcomes in ways you can measure.</p><p>Scientists publishing papers in weeks instead of a year. Users who feel ownership because they were consulted. Features that match how people actually work, not how engineers assumed they would work. These aren&#8217;t soft benefits. They&#8217;re the kind of outcomes engineering leaders care about.</p><p>If you&#8217;re skeptical, I&#8217;m not asking you to overhaul your process. I&#8217;m asking you to try an experiment. Before your next project kicks off, ask four questions:</p><ol><li><p>Have we talked to the people who will use this?</p></li><li><p>Do we understand how they work today, not just what they say they want?</p></li><li><p>Have we surfaced technical constraints alongside user needs, in the same room, at the same time?</p></li><li><p>Is someone responsible for holding the user&#8217;s perspective throughout the project, not just at the beginning?</p></li></ol><p>If the answer to any of those is no, you&#8217;re placing a bet. Maybe it&#8217;ll pay off. But the research, and my experience, suggest the odds aren&#8217;t in your favour.</p><p>When you&#8217;re building for Mars, you don&#8217;t get many chances to iterate. 
When you&#8217;re building for anyone else, you might get more chances. But each iteration is more expensive than getting it closer to right the first time. Design isn&#8217;t the opposite of moving fast. It&#8217;s how you avoid moving fast in the wrong direction.</p>]]></content:encoded></item><item><title><![CDATA[From Finger-Pointing to Shared Ownership]]></title><description><![CDATA[Making product trios work when your engineering team is undersized and underwater]]></description><link>https://articles.concepttocloud.com/p/from-finger-pointing-to-shared-ownership</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/from-finger-pointing-to-shared-ownership</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Mon, 15 Dec 2025 11:03:35 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>A Familiar Dysfunction</strong></p><p>In many organisations, a core issue is the persistent misalignment between business objectives and IT capabilities. Picture a business racing ahead so quickly that no one has time to tackle bigger-picture problems. Imagine, however, if those efforts were channeled toward creating shared customer value, not just managing internal efficiencies. IT often becomes a bottleneck, overwhelmed by competing requests and tasks, making prioritization nearly impossible. The result? Frustrating finger-pointing all around when that energy could be harnessed to enhance essential work and improve alignment across the organization. 
By focusing on a cohesive path to meet customer needs, the Trio model can transform frustration into collaborative success.</p><p>In several organisations I&#8217;ve worked in, requests stay in backlog for months due to a lack of prioritisation bandwidth. Frequent urgent escalations, often self-serving, derail progress and force immediate attention. Business teams bypass IT using shadow solutions, lacking confidence in IT delivery. Meanwhile, IT feels reduced to an order-taking role rather than an autonomous, decision-making organisation.</p><p>Not all organisations operate this way, but high-pressure environments can slip into these habits unnoticed. Recovering requires both awareness and courage to rebuild effective, aligned collaboration.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, 
https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="6000" height="4000" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4000,&quot;width&quot;:6000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;man holding stick statue during daytime&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="man holding stick statue during daytime" title="man holding stick statue during daytime" srcset="https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, 
https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1626615346714-29763470b5e9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxibGFtZXxlbnwwfHx8fDE3NjU2ODgxNzB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">It&#8217;s his fault! 
- Photo by <a href="https://unsplash.com/@christianchomiak">Christian Chomiak</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><div><hr></div><p><strong>Why the Usual Fixes Fail</strong></p><p>The core issue: business and tech are structured as separate entities that hand work back and forth, rather than solving problems together.</p><p>The fixes most organisations reach for first are ones you&#8217;ve probably already tried:</p><ol><li><p>Hiring more managers</p></li><li><p>Hiring more engineers</p></li></ol><p>Extra headcount doesn&#8217;t fix misalignment; it adds more people who lack context and increases salary costs without resolving core issues. Extra alignment meetings often waste engineering hours through &#8216;status theatre.&#8217; Quantifying this lost productivity can underscore the urgent need for change.</p><p>Next comes a ticketing and prioritisation system. This helps with some symptoms and increases visibility if managed properly. But it does not address the chaos itself.</p><p>You then add more alignment meetings with leaders, developers, product managers, and engineers. These meetings become status theatre, with participants seeking to outdo one another, leading to team-wide frustration.</p><p>Each of these fixes keeps business and tech separate, so the misalignment recurs and problems stay unresolved instead of being solved jointly.</p><div><hr></div><p><strong>The Trio Model - What It Is and What It Isn&#8217;t</strong></p><p>Today I want to talk about the Trio model. For anyone unfamiliar with it, it&#8217;s a collaboration pattern in which three different parts of the organisation work together. 
In an engineering-light organisation, your trio may look slightly different from the usual one. For example, you might end up with:</p><p>&#8226; A business owner (this person would own the problem and the customer relationship)<br>&#8226; A technical lead who owns the feasibility and implementation (probably working as an IT engineer)<br>&#8226; A designer, an analyst or an ops lead (depending on the problem type)</p><p>These people are expected to work together to enable co-ownership of outcomes, not just a hand-off of tasks. The Trio should rally around customer-centric metrics such as customer satisfaction, net promoter score, time-to-market, or defect rate: tangible goals all Trio roles can collectively strive to improve, translating shared outcomes into concrete action.</p><p>Of course, the Trio isn&#8217;t a committee or approval layer. It&#8217;s not there to oversee but to bring the right people together before decisions are made, not after. It&#8217;s also not consensus-driven: three people don&#8217;t mean three votes. Instead, decision rights are explicitly assigned: the business owner has final say over customer priorities, the technical lead makes the call on feasibility and implementation, while the designer, analyst, or ops lead decides how to optimise the user experience, business process, or operations, depending on the issue. The business owner cannot overrule the tech lead on technical feasibility, nor can the business owner be overruled on customer needs. Clear assignment of decision rights prevents gridlock and ensures each key area is covered. It is important to note that the trio is not the entire delivery team. They make directional and trade-off decisions; they do not execute all the work themselves. 
In a 10-15-person IT organisation, not everyone can be in a trio; otherwise, core work would stall.</p><p>Trios are not permanent. They form around defined problems and disband when those problems are resolved. Assigning people to trios without clear problems only recreates the unnecessary committees this approach is meant to avoid.</p><p>A warning: &#8216;the trio owns it&#8217; can quickly become &#8216;no one owns it.&#8217; Each role still has distinct accountability. While there is a shared outcome, each person has specific responsibilities.</p><div><hr></div><p><strong>Making It Work: The Structural Stuff</strong></p><p>So, of course, the question is: which problems actually warrant a Trio, and which should just be dealt with in the usual manner? This normally falls into several categories.</p><p>Often, the projects warranting a Trio are ones with high ambiguity, cross-functional dependencies, or strategic importance, or maybe a combination of all three.</p><p>What often doesn&#8217;t require a Trio are well-defined requests, business-as-usual work, and pure technical debt that can be dealt with via a single stream.</p><p>Leaders also need to figure out how to prevent every business stakeholder from demanding their own engineer in a Trio. Otherwise, as mentioned earlier, you end up with a lot of people in Trios and no one to actually do the work.</p><p>Too many Trios spread technical people thin; too few create confusion and inertia. Balance isn&#8217;t a one-time fix. Leaders should make trade-off decisions a regular practice. A lightweight quarterly review of Trio assignments and resources helps sustain balance and builds leadership strength, ensuring strategic and operational goals are met.</p><p>Finding balance here is challenging.</p><p>The next question is about meeting cadence, but the real question here is authority. 
C-suite execs tend not to care much about meeting frequency, except that it gives the optics of productivity without necessarily being productive. What they actually care about is: &#8220;Will this create more meetings that waste time, or fewer escalations that waste my time?&#8221; By framing authority as a trust dividend, you can link decision autonomy to fewer escalations and faster learning loops. This perspective encourages executives to see granting authority as a strategic investment rather than a risk, fostering a more efficient and responsive organisation.</p><p>The answer depends on whether the Trio has the authority to decide. If every call goes up the chain for approval, the Trio becomes just another layer. The real cadence question is what the Trio can decide without asking permission, and how often they need to meet to exercise that authority. This could be daily, a few times a week, or weekly. The Trio&#8217;s priority matters, too. A long-running Trio may only need to meet once a quarter, and that can work.</p><p>Decision rights are the load-bearing part of making this work. Most organisational dysfunction stems from ambiguity. The business thinks that it sets priorities. IT thinks it is drowning in competing priorities with no clear ranking, because everything comes in super urgent. IT thinks it is empowered to push back on bad ideas. The business thinks it&#8217;s IT&#8217;s job to execute whatever ideas come in and that IT is being obstructionist on purpose. No one knows who can say no to the CEO&#8217;s pet project. The trio model only works if you can answer the questions below explicitly. One way in is to pose the &#8216;Who says no?&#8217; test: challenge your team to summarise their trio&#8217;s escalation rules in one tweet. 
For instance, &#8216;Only the Tech Lead can veto based on feasibility; Business Owner has final say on customer priorities.&#8217; This brevity test often exposes lingering ambiguity that fuels dysfunction.</p><ul><li><p>Who can green-light work entering the trio&#8217;s scope?</p></li><li><p>Who can kill or descope work that&#8217;s not feasible?</p></li><li><p>What trade-offs can the trio make without escalation?</p></li><li><p>What must escalate? (Things like budget thresholds, timeline changes, scope changes beyond X)</p></li></ul><p>If you don&#8217;t answer these questions and make the answers prescriptive, you&#8217;ll end up with the same bottlenecks you have now, just with a new name.</p><p>Answering them also lets you define success metrics, where the real question is shared accountability. The classic setup measures IT on delivery velocity and system uptime, and the business on revenue and customer outcomes, and these can point in opposite directions. If the metrics aren&#8217;t shared, the trio becomes adversarial: the business owner pushes for speed, the tech lead pushes for quality, and the third person picks a side or checks out. Which is, of course, very suboptimal. A trio needs at least one shared outcome metric they&#8217;re all accountable for. Not &#8220;IT delivered on time&#8221; and &#8220;the business adopted it&#8221;, but &#8220;this problem got solved, and here&#8217;s how we&#8217;re measuring it as a team&#8221;, which the trio can then report up the chain to the C-suite.</p><div><hr></div><p><strong>What Makes It Actually Stick: The Leadership Part</strong></p><p>Here is the uncomfortable truth: the structural stuff will fail if leadership behaviour doesn&#8217;t change. In short:</p><ul><li><p><strong>Stop sending &#8220;urgent&#8221; requests directly to IT</strong> that bypass the trios. 
Every time you do, you undermine the model.</p></li><li><p><strong>Stop letting business leaders commit to timelines without technical input.</strong> If sales promises a feature by Q3 without talking to engineering, you&#8217;ve already broken the system.</p></li><li><p><strong>Protect trio time.</strong> If your 10-person IT team is running 15 trios plus BAU support plus incidents, you don&#8217;t have a trio model - you have a burnout model.</p></li><li><p><strong>Hold business owners accountable for outcomes, not just engineers.</strong> If a project fails, the business sponsor should be in that post-mortem, too.</p></li></ul><p>Leaders earn trust here not just by deciding what to build, but by deciding what to stop doing.</p><p>To expand on each point: the structural stuff will fail if leadership behaviour doesn&#8217;t change. This is a big part of why many of these processes just don&#8217;t stand the test of time: it is leadership that is unwilling to change, not the ground-level staff who implement these patterns.</p><p>As an organisation, you need to stop sending urgent requests directly to IT that bypass the trios. Every time you do, you undermine the model. If every request is urgent, there is no way for an IT team to prioritise requests.</p><p>Equally, the business needs to accept that deliveries will sometimes get pushed out, because legitimately urgent requests will come in that need to be actioned, and that takes time and has a knock-on impact on the delivery of other work.</p><p>We also need to stop letting business leaders commit to timelines without technical input, because we&#8217;ve seen this time and again. If the sales team promises a feature by Q3 without talking to engineering, you&#8217;ve broken the system because there&#8217;s no input from the people doing the implementation. 
Without that input there is no guarantee that those people can actually meet those timelines, because nobody has checked their existing workload and delivery schedule, or whether the commitment is even feasible.</p><p>It is also important to protect trio time. If your 10-person IT team is running 15 trios, plus BAU support, plus incidents, you don&#8217;t have a trio model; you have a burnout model. (Trinkenreich et al., 2023) Everybody is knee-deep in trios, trying to plan the next bit of work and understand what&#8217;s going on. Doing that whilst supporting existing systems and developing new ones is nigh on impossible, simply because of the volume of requirements and requests coming into the organisation.</p><p>The last point that actually makes this stick is holding business owners accountable for outcomes, not just engineers. If a project fails, the business sponsor should be in that post-mortem too because they are part of this chain. It is not purely an engineering function that causes the failure of a specific plan. It can be a lack of specification or accurate scoping, and it takes more than two to tango. Ensuring that the business is also held accountable for its side of the bargain is important when running post-mortems to understand where the failures actually occurred. If you always assume that the business is right and the IT organisation is wrong, you will forever be in a spiral of finger-pointing, with the IT organisation taking the blame.</p><div><hr></div><p><strong>A Realistic Starting Point</strong></p><p>As we reach the end of this post, it would be nice to offer some actionable pointers people can work with.</p><p>My suggestion: pick one or two high-value, high-friction problem areas inside your organisation and form trios there. Begin with the following steps: First, select and clearly define the problem areas where a Trio can bring substantial improvements. 
Second, assign clear roles to each Trio member, ensuring everyone understands their responsibilities and contributions. Third, establish specific metrics to measure success, such as customer satisfaction scores or time-to-market metrics. Finally, schedule a review after a quarter to assess the Trio&#8217;s performance, identify challenges, and make necessary adjustments. Run it for a quarter, see what breaks, and iterate. It&#8217;s not an overnight change that&#8217;s going to suddenly transform your organisation, but it will enable more direct communication and allow multiple groups to take on responsibility for ensuring that requirements are defined and that everybody has a stake in the outcome. Work is no longer chucked over a wall with the IT team expected to pick it up and run with it.</p><p>It&#8217;s also important to acknowledge that this is hard and takes longer than you&#8217;d like. That&#8217;s the difference between a real fix and another process that becomes what I call shelf-ware. We try something for a week, decide that it doesn&#8217;t work, and then chuck it on the shelf and pretend that we never did it. We continue doing the same things over and over again, leading to significant employee dissatisfaction, churn, and project delivery failures. (Costa et al., 2024) But remember, real change is not easy, and it takes bravery to step out of the comfort zone. Let this be a call to courage for leaders: be the catalyst for change and sustain the momentum to truly transform your organisation.</p><p>Of course, it takes effort to make this work; it takes buy-in across the organisation. 
If you don&#8217;t get it, you&#8217;ll never succeed.</p>]]></content:encoded></item><item><title><![CDATA[Why Your Engineers and Product Managers Still Don't Talk]]></title><description><![CDATA[(And How to Fix It)]]></description><link>https://articles.concepttocloud.com/p/why-your-engineers-and-product-managers</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/why-your-engineers-and-product-managers</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Wed, 10 Dec 2025 10:01:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yKvr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Ever worked in a business that has your diary booked up from early morning to late in the afternoon? You have all the meetings but still no real progress, communication is still lacking, you all work remotely, and nobody knows what anyone else is doing?</p><p>You spend what feels like weeks in sprint planning, daily standups, twice-weekly standups for something else, roadmap review meetings, backlog grooming. And then someone else from the business says &#8220;well actually we need this other thing&#8230;&#8221; and the team feels like they&#8217;ve been blindsided by some random request, while project managers feel like they&#8217;re throwing specs over the wall and hoping they land somewhere useful.</p><p>This feels like it happens at so many different organisations. All that effort goes into planning, only to find that people are none the wiser or, worse, have literally no idea what&#8217;s going on in general. 
The problem here, though, isn&#8217;t missing rituals. It&#8217;s that the rituals we&#8217;ve put in place are optimised for the wrong thing.</p><p>Cross-functional ceremonies, meetings, whatever you want to call them, are designed to move information: status updates, requirement hand-offs, timeline syncs, functional information. What&#8217;s the word? Perfunctory: carried out with a minimum of effort or reflection. These meetings answer the &#8220;what&#8221; and the &#8220;when&#8221; but without building the rapport between teams, people and employees needed for &#8220;why does this matter to you?&#8221; or &#8220;what are you worried about?&#8221; These types of questions are really key to the world we live in. Sure, timing and requirements are all important, but developers developing without understanding the bigger picture, or the concerns of people outside the developer bubble, will forever be on the back foot.</p><p>I can give you some examples. First up, an example of a well-run, if draining, standup. When I worked at NASA JPL we worked on a piece of science software that was required to process data downlinked from NASA&#8217;s Perseverance Rover. For years, two or three at least, I joined nightly standups because I was based in the UK and we had to get Los Angeles, Melbourne and London on the phone at the same time. So 11pm for 20-30 minutes every night was the only time that really worked. Of course this wasn&#8217;t the most productive of times because I&#8217;d already been working all day, but the standups were effective: we could hand work off, discuss blockers and so on without it unduly impacting too many people.</p><p>Then after that I went to work for a medical startup. We had a team on the US East Coast and a team in India. 
This made life especially hard: trying to keep teams in the US functioning, information handed off and project managers happy, with the scrum masters in India trying to wrangle ill-defined requirements and me sat somewhere in the middle trying to both get my own work done and answer questions on both sides of the pond. The problem was that they kept having daily standups, but the exchanges were purely transactional. No questions were asked, no real conversations were had, and as a result the business owners were frustrated and the developers jaded.</p><h2>The 5 Dysfunctions</h2><p>So now I must ask the question. Who&#8217;s heard of Lencioni&#8217;s Pyramid? Excellent book and topic of many other blogs. I&#8217;ll just add this one to the list! In 2002 Patrick Lencioni wrote about the 5 dysfunctions of a team. The book walks through the many issues and sticking points teams face as they seek to grow together, and in doing so Patrick looks at the causes of organisational politics and team failure. Something a lot of us can relate to.</p><p>The five dysfunctions are:</p><ul><li><p>Absence of Trust</p></li><li><p>Fear of Conflict</p></li><li><p>Lack of Commitment</p></li><li><p>Avoidance of Accountability</p></li><li><p>Inattention to Results</p></li></ul><p>The issue with the way a lot of organisations work, though, is that they operate at the top levels of the pyramid. They focus on accountability and results whilst assuming that the foundations (trust, healthy conflict and commitment) already exist. Of course in so many cases they don&#8217;t. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yKvr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yKvr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!yKvr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!yKvr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!yKvr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yKvr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png" width="1024" height="608" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/feb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:608,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!yKvr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!yKvr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!yKvr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!yKvr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb328be-b75b-4851-b6bf-2009a75fe8c5_1024x608.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>Over time, though, what you will learn is that if your rituals, those meetings you all feel are so important, are missing the foundation layers, you&#8217;re building on sand and the whole thing will crumble around you.</p><p>So what is the absence of trust? The absence of trust is the unwillingness to be vulnerable within the group or within the wider organisation, which leads to a lack of trust between peers. Fear of conflict involves seeking artificial harmony over constructive, passionate debate, and not understanding when to push back and hold your ground when other people are passionately arguing a different position.</p><p>Lack of commitment is about feigning buy-in for group decisions, which creates ambiguity throughout the organisation. 
People flip-flop between different points of view depending on who they&#8217;re speaking to, and create extra work for those who are then empowered to implement the changes.</p><p>Avoidance of accountability involves ducking the responsibility to call peers and superiors on counterproductive behaviour, which sets low standards. As a result, people further down the chain are expected to hold themselves to higher standards than the boss who sets those standards, which is counterintuitive. Lastly, inattention to team results means putting personal success, status and ego before team success, which of course is detrimental to the team&#8217;s output as a whole.</p><p>Of course this raises the question: what do we do instead? How about starting a meeting between technology groups with Q&amp;A? Not forced icebreakers but structured moments where admitting uncertainty is normalised. I&#8217;ll give you an example: how about an engineering lead opening a planning session with &#8220;here&#8217;s what I don&#8217;t understand about this quarter&#8217;s priorities&#8221; and inviting the PM to do the same? </p><p>That back and forth normalises uncertainty between the groups, both those dictating what should be implemented and those doing the implementation work, reducing the friction and allowing everybody to voice their uncertainties as a group.</p><p>On top of rituals that surface vulnerability, how about rituals that practise disagreement? Teams that only agree in meetings and fight in Slack have a ritual problem. What does it look like to build productive friction into your cadence? For example, how about a team that introduces a red team rotation into roadmap reviews, deliberately asking the questions that would create friction, to ensure that it is discussed during the meeting rather than offline while grumbling in your Slack channel?</p><p>Finally how about rituals that build shared context, not just shared information? 
The difference is between &#8220;Here&#8217;s the roadmap&#8221; and &#8220;Here&#8217;s the trade-off I struggled with and why I landed here&#8221;. For example, consider a founder who replaced quarterly all-hands presentations with smaller cross-functional context sessions. That allows people to discuss smaller chunks of work while also understanding the reasoning behind the decisions being made, rather than getting a super-high-level, top-down view of work that&#8217;s supposed to be carried out across a broad swathe of the engineering function. We actually saw something similar on the medical project I mentioned earlier, when we split the groups into much smaller teams, allowing for better communication across different groups while still allowing the managers to track the broader project goals at a higher level. </p><p>Of course involved in all of this is the conflict resolution dividend. When rituals build trust and normalise disagreement, conflict resolution becomes cheaper. Teams don&#8217;t need elaborate escalation paths because they&#8217;ve already practised navigating tension in lower-stakes settings. So when a problem arises, the fallout is far smaller and far easier to deal with.</p><p>Over the years I&#8217;ve worked in both types of environment: those where engineering functions regularly get overwhelmed by requests, feedback, and negativity, and those where they thrive.</p><p>Having worked in various startups in London over the years, we often saw small engineering teams handed unrealistic goals and deadlines. Then project managers would get annoyed when delivery was delayed for reasons that they couldn&#8217;t comprehend. 
Of course, had this been dealt with at a smaller team level, focusing on individual pieces rather than the bigger picture, it might have been alleviated or remediated earlier.</p><p>Equally, as I explained earlier in this blog, I&#8217;ve worked in many environments where small-scale conflict resolution was dealt with both effectively and efficiently by product managers, project managers, and engineering leaders. That makes a huge difference when it comes to delivering on time and to expectation, because there&#8217;s less fear of retribution within the engineering team should something crop up that would negatively impact the timelines involved.</p><h2>Conclusion</h2><p>I understand why there is pushback when it comes to relationship-building rituals. They obviously feel soft, they don&#8217;t feel like they contribute directly to the delivery of the product or project, and so it&#8217;s harder to justify a relationship-building ritual in a planning document than it is to justify a sprint review or something similar. Leaders from large corporations, non-engineering organisations and so on may see these meetings as indulgent, but when you&#8217;re talking about a large swathe of employees trying to navigate requirements, deliverables and an ever-changing tech landscape, I believe they are invaluable.</p><p>If you&#8217;re a C-suite reader, the return on investment is real, but it compounds slowly. You won&#8217;t see results in the next sprint. But when it comes to the next crisis, or when your team has to bail the company out because they&#8217;re being asked to do something that wasn&#8217;t in the plans or wasn&#8217;t in the roadmap, that&#8217;s when you&#8217;ll see the real return on investment from these relationship-building meetings.</p><p>The organisations that get this right don&#8217;t have more meetings. 
They have meetings that do different work.</p><p>As we sum this blog post up, the structure of your rituals reveals what you actually believe about how work gets done. If the ceremony is about outputs and timelines, you&#8217;re communicating that relationships are someone else&#8217;s problem and not yours. Over time, as stated earlier in this blog post, that will lead to larger problems that are trickier to solve in ever-shortening timelines. The other negative impact is on the happiness and contentment of your staff. That also has a detrimental impact on output, as staff become less inclined to do the extra work or put in all the hours when they are frustrated with a never-ending list of stuff that needs to be completed.</p><p>We started this blog post talking about the requirements for meetings and how those meetings are structured. The organisations that get this right don&#8217;t have more meetings. They have meetings that do different work and communicate the requirements and deliverables in a way that resonates more effectively with the staff at hand.</p><p>It doesn&#8217;t all have to be transactional. It can be very much a relationship-building exercise while still getting the same points across in less time.</p><p>So if you&#8217;re a C-suite executive talking to your product managers or project managers, just have a think about how best to structure those meetings. Think about what you can do as an organisation to increase output and performance while reducing the friction and resentment inside the organisation by improving the relationships between different teams and the dependencies on those teams. 
They will forever thank you for it as conflict and expectations become much easier to manage across the organisation.</p>]]></content:encoded></item><item><title><![CDATA[Kubernetes: Overkill or Overlooked?]]></title><description><![CDATA[Do you really need it?]]></description><link>https://articles.concepttocloud.com/p/kubernetes-overkill-or-overlooked</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/kubernetes-overkill-or-overlooked</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Mon, 01 Dec 2025 15:45:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!BdHb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>So if you listened to my last <a href="https://www.engineeringevolved.com/">Engineering Evolved</a> podcast then you&#8217;d know that the general answer is no. But obviously it is a little more nuanced than that.</p><p><a href="https://www.linkedin.com/posts/erudenko_devops-cloudinfrastructure-infrastructureascode-activity-7395040411649626112-uVCd?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAAFQk0YBHT56IwgtXi2UeWXN6OCVmy_PBdw">Let me direct you to a LinkedIn post</a> that caught my eye the other day because I thought it was a joke, then it became apparent in the comments that he wasn&#8217;t joking.</p><p>Now the argument that you should use &#8220;more modern tooling&#8221; because it&#8217;s hot is one that&#8217;s been raging since the dawn of time. But of course, why change what works? And if a VM and SSH works for you, and you have a small team, have at it!</p><p>So this then leads me on to Kubernetes: what is it, and should you use it? 
Kubernetes spun out of Google, where their <a href="https://en.wikipedia.org/wiki/Borg_(cluster_manager)">Borg</a> platform was dealing with massive Google-sized deployments on a daily basis. Now if that doesn&#8217;t tell you what Kubernetes is good at, nothing will. But for those of us who like to hack around on things, Kubernetes basically takes Docker images and a lot of YAML descriptors to deploy your application, or applications, into a cluster of machines that then manages your containers, networking, ingress, egress, permissions etc for you. Sounds great, right?</p><p>Well, in a number of cases it is great. It can provide a lot of fault tolerance, auto scaling and auto healing, all the things your business may be interested in leveraging, and some of that, but not all of it, comes out of the box.</p><h2>Downsides</h2><p>Okay, so what are the downsides of Kubernetes? Let&#8217;s start on a downer and finish more positively. It&#8217;s super complex. Sure, every cloud vendor has their own hosted solution, and there are a number of vendors offering up support for their Kubernetes variant if you&#8217;d rather deploy it elsewhere, but that doesn&#8217;t mean it&#8217;s easy to wrap your head around. Azure attempt to use <a href="https://azure.microsoft.com/en-gb/products/app-service">Azure App Service</a> to wrap Kubernetes in something that makes it easier to deploy apps. AWS just launched EKS <a href="https://aws.amazon.com/eks/auto-mode/">Auto Mode</a> in an attempt to reduce the management burden. They all know it&#8217;s hard work. Cluster security, cluster upgrades and so on all take planning and forethought on top of managing your own pods. And if you&#8217;re self-hosting, your host VMs also need patching, upgrades and remediations.</p><p>On top of this the majority of Kubernetes is driven by YAML in some form or other. 
Sometimes this is simple, often it&#8217;s not, but it&#8217;s also about the way you manage it: how do you deal with change requests, version control, config drift, different people applying policies etc? All of this requires careful management, and it&#8217;s not really a technical problem; it&#8217;s more of a policy and people problem. Do you have the bandwidth to manage it?</p><p>At smaller scales Kubernetes is just more expensive. This obviously manifests itself in a number of ways and isn&#8217;t just one number. For example the additional complexity means more man hours. But it also means you&#8217;ve got more nodes, a control plane (sure it may be &#8220;free&#8221; in a vendor but you&#8217;re paying for it somewhere), more infrastructure for ingress etc. It all adds up. </p><p>Slower developer loops are also a real issue. You can, if you&#8217;re a sadist, run Kubernetes on a local machine. But this still requires the same YAML setup, the same config tweaking, the same maintenance. It can be great for debugging issues in the cluster, but in reality it&#8217;s still hard to manage and certainly not developer friendly.</p><p>Lastly on the downside, the mirage of Kubernetes portability. Of course Kubernetes is designed to offer somewhat universal configuration across cloud providers, but this often doesn&#8217;t really play out in practice. You still need to have a good understanding of your vendor&#8217;s networking, its ingress options, its available storage etc. When you deploy storage, do you need slow disks? Cheap disks? High performance disks? Maybe a mix? How do you pick them, and how do you configure your setup to use them? It all adds to the operational complexity of a Kubernetes deployment.</p><h2>Upsides</h2><p>Portability is obviously real. We can deploy Docker containers into almost any environment, on prem, in any cloud environment. 
And whilst there is a degree of needing to understand the environment and what it provides, the core product itself probably doesn&#8217;t need any changes, making it easily deployable in a range of different environments. The other thing about Docker containers is that developers can also just run them using Docker Desktop, Podman etc, making them runnable across a number of different container tools.</p><p>The other part of deploying on Kubernetes is the declarative infrastructure. Whilst having everything defined as a YAML file can certainly add more control complexity, it is also a positive. Once you&#8217;ve described the state you want, Kubernetes will do its best to keep everything in sync. Self healing, rollbacks on failure, consistency across different environments. That all comes from the bit that also makes Kubernetes hard to manage from a process perspective.</p><p>Of course in reality there is a rich ecosystem of additional services, products and people who can all plug into your Kubernetes infrastructure. Products like Argo can provide excellent application management. Prometheus provides out of the box support for monitoring of your processes and so on. There is an endless amount of additional tooling that can both help and hinder depending on what you&#8217;re trying to achieve.</p><p>Resource isolation provides an element of multi tenancy or security patterns across your resources. Obviously this could be achieved outside of Kubernetes as well, but RBAC roles, network boundaries etc are all provided out of the box. Now of course, these can be hard to configure and maintain, and you also want to hook them into your main security systems, which is probably easier to do inside of a cloud vendor&#8217;s system but still obscure: ensuring you&#8217;ve got namespaces scoped properly, maintaining that network firewall at the correct place etc. 
This is what I mean by the man management of the change requests: it only takes one badly changed YAML file to blow that right open. But when it works well, having all your processes on a single scalable cluster can really make managing them a breeze.</p><p>Next on the list of positives is the extensibility. If it can&#8217;t do what you want it to, the operator framework gives you the ability to extend and improve the environment. The APIs have been standardised for a long time and this is what drives that ecosystem we mentioned earlier.</p><p>Deployment patterns can really help you. Of course some of these need additional tooling if you&#8217;re going to do them well, but things like Rolling Updates, Canary and Blue Green releases are all absolutely doable out of the box. What you need to ask, though, is: do you really need Blue Green releases? Or did you just read about them and figure they sounded cool? </p><p>Last on the list of positives is observability hooks. One of the biggest problems with running anything inside a container is, &#8220;what&#8217;s going on inside?&#8221;. Hooks for logs, metrics and traces can really help here, both inside your containers and for the traffic through and across the platform. The ecosystem across the observability sector has really grown over the past decade, and standards like OpenZipkin (founded by my friend from back in the JClouds days Adrian Cole) have really aided this drive.</p><h2>So how do I choose?</h2><p>Now this is the million dollar (in some cases actually million dollar) question. What is it you&#8217;re trying to achieve? Can you achieve the same outcomes by doing less? If the answer to this question is &#8220;yes&#8221;, then do less. I honestly believe that in the modern tech landscape, if you can do something by spinning up a VM and running a process on there, even if that thing sits inside a Docker container for convenience, then you probably should. 
Of course there are times that doesn&#8217;t hold true. But even if its not true, do you still need a Kubernetes cluster? Or will something like Lambda for backend, ECS for frontend, Google Cloud Run, or Azure App Services work just as well?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BdHb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BdHb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!BdHb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!BdHb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!BdHb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BdHb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png" width="424" height="251.75" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:608,&quot;width&quot;:1024,&quot;resizeWidth&quot;:424,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!BdHb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!BdHb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!BdHb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!BdHb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fe82aa6-5097-42bf-972f-b91878f7a55c_1024x608.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Pondering life choices whilst looking at servers</figcaption></figure></div><p>In the 21st century there are multiple ways to skin this cat. Even if you&#8217;re looking for portability, would it cost more to package your app as a Docker container and then write separate deployment scripts for each cloud provider&#8217;s bog-standard application deployment?</p><p>These are the types of questions you need to ask. It&#8217;s not just the cost of running the thing; people instantly think of hosting costs, but when you&#8217;ve got a platform as complex as Kubernetes, what&#8217;s the human cost, both monetarily and in terms of raw time?</p><p>Kubernetes is a powerful beast; in the right places it makes a lot of sense. 
But in the wrong places it can be a security, maintenance and management disaster waiting to happen.</p>]]></content:encoded></item><item><title><![CDATA[Building a scalable, pluggable, cloud based web crawler.]]></title><description><![CDATA[Why we reinvented the wheel, again...]]></description><link>https://articles.concepttocloud.com/p/building-a-scalable-pluggable-cloud</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/building-a-scalable-pluggable-cloud</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Mon, 13 Oct 2025 16:05:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!MazP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Web crawling and API traversal have existed for as long as the internet has. Many crawlers have come before, so why, when I was working at the Accelerator for Princeton University, did we embark on writing a new one?</p><p>Crawlers are often designed for one task, such as crawling websites or crawling a specific target, and I wanted to create something that would serve more than one purpose. There is always a lot of replicated code when building separate crawlers and I wanted to remove as much of that as possible whilst also bringing a standardised output format for all the sites we crawled to make the ETL process as easy as possible.</p><p>This is why the distributed crawler was born. It allowed us to crawl both Telegram and the YouTube Data API without having to rebuild huge chunks of very similar crawler logic across both platforms. 
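</p><p>The idea above (one shared crawl loop, one standardised output record, and a plug-in point per platform) could be sketched in Go roughly as follows. This is a minimal sketch under stated assumptions, not the crawler&#8217;s actual code: the <code>Record</code> and <code>Platform</code> types, the <code>CrawlAll</code> function and both stub platforms are hypothetical names made up for illustration, and the stubs make no network calls.</p>

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Record is a hypothetical standardised output row shared by every platform,
// so the downstream ETL step only has to understand one shape.
type Record struct {
	Platform  string
	Source    string // channel, playlist, or site the item came from
	ID        string
	Text      string
	FetchedAt time.Time
}

// Platform is the hypothetical plug-in point: one implementation per source.
type Platform interface {
	Name() string
	Fetch(target string) ([]Record, error)
}

// telegramStub stands in for a real Telegram client; it makes no network calls.
type telegramStub struct{}

func (telegramStub) Name() string { return "telegram" }
func (telegramStub) Fetch(target string) ([]Record, error) {
	return []Record{{Platform: "telegram", Source: target, ID: "msg-1", Text: "hello", FetchedAt: time.Now()}}, nil
}

// youtubeStub stands in for a real YouTube client; it makes no network calls.
type youtubeStub struct{}

func (youtubeStub) Name() string { return "youtube" }
func (youtubeStub) Fetch(target string) ([]Record, error) {
	return []Record{{Platform: "youtube", Source: target, ID: "vid-1", Text: "a title", FetchedAt: time.Now()}}, nil
}

// CrawlAll runs the same shared loop over every registered platform and
// collects the results in registration order.
func CrawlAll(platforms []Platform, target string) ([]Record, error) {
	if len(platforms) == 0 {
		return nil, errors.New("no platforms registered")
	}
	var out []Record
	for _, p := range platforms {
		recs, err := p.Fetch(target)
		if err != nil {
			return nil, fmt.Errorf("%s: %w", p.Name(), err)
		}
		out = append(out, recs...)
	}
	return out, nil
}

func main() {
	recs, err := CrawlAll([]Platform{telegramStub{}, youtubeStub{}}, "example-channel")
	if err != nil {
		fmt.Println("crawl failed:", err)
		return
	}
	for _, r := range recs {
		fmt.Println(r.Platform, r.Source, r.ID, r.Text)
	}
}
```

<p>With a shape like this, adding another source would mean writing one more type that satisfies <code>Platform</code>, while the loop and the output format stay untouched.</p><p>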
I also wanted it to support multiple cloud backends, offer extensible configuration properties for the researchers who were going to use it, and have as small a footprint as possible.</p><p>I started by writing the Telegram portion of the crawler. As ever, building and testing something that required many SIM cards to operate was an interesting challenge. With a SIM farm in my office, I succeeded in connecting the crawler to multiple Telegram API backends using a distributed technique, allowing for quite effective scalability of the platform. Of course, ensuring that the data you collect is both accurate and complete is an interesting challenge when you&#8217;re trying to crawl multiple channels inside of a Telegram account to create a global picture of what&#8217;s happening on the Telegram network. That involved an awful lot of manual checking, checkpoints, and checksums to ensure that the platform collected the data the researchers needed to do what was important to them.</p><p>The other interesting challenge was collating the multimedia that is obviously synonymous with every social media network these days. 
That involved us looking for files, downloading different file types, processing them, and extracting numerous extra data using OCR techniques and metadata extraction on the files themselves to be able to provide search context and lookups for researchers when looking at specific topics that are embedded in the multimedia themselves.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MazP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MazP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png 424w, https://substackcdn.com/image/fetch/$s_!MazP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png 848w, https://substackcdn.com/image/fetch/$s_!MazP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png 1272w, https://substackcdn.com/image/fetch/$s_!MazP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MazP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:78965,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://articles.concepttocloud.com/i/175922158?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MazP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png 424w, https://substackcdn.com/image/fetch/$s_!MazP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png 848w, https://substackcdn.com/image/fetch/$s_!MazP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png 1272w, https://substackcdn.com/image/fetch/$s_!MazP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d81be8-2bc3-41b4-8a50-d9356021adbb_3333x1875.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Once I had built out this framework, I realised that I wanted to make it pluggable so that I could add the YouTube Data API. At that point, I leveraged <a href="https://www.claude.com/product/claude-code">Claude Code</a> to help me refactor the existing codebase into a more generic interface that would allow me to plug in different social media endpoints. This was quite effective, and it provided a useful entry point for Claude to then write the YouTube API layer, leveraging the Golang libraries that YouTube had provided. The YouTube layer was easier to implement in a lot of respects because it didn&#8217;t require access to SIM cards. However, the YouTube Data API is quite restrictive in terms of both the data you can get and the amount you can collect. 
Luckily, though, I was working for Princeton University, and so we made a valid submission to YouTube to allow access to their research API. This is basically exactly the same as the Data API, but scaled up for larger-volume data access. What you don&#8217;t get with either the Data API or the research API, though, is the video content itself, or the transcriptions for the video content (unless of course you happen to be the owner of that video). This means that you can analyse titles, descriptions, and engagement metrics such as comments, but you can&#8217;t actually access the content of the video itself. This is not immediately problematic, but it could cause problems for people wanting to build upon this in the future. There is a workaround, and we will look to add it to the distributed crawler in the future: an open-source project called InnerTube. InnerTube leverages a public yet undocumented API that allows access to a number of YouTube features that aren&#8217;t readily available through the Data API. This includes the transcriptions that we otherwise do not have access to, as well as things like similar videos and the other items you would see in the sidebar. The assumption is that this API exists because smart TVs and other smart devices need access to YouTube without a user being logged in, so there has to be a public API available to them. InnerTube cleverly leverages this to provide the same service as a library for other applications.</p><p>I mentioned earlier that I wanted to be able to use multiple different backends to support the distributed crawler. The reason for this, of course, is that outside of running it in a Kubernetes cluster in Azure, which is where we were going to run it, people might want to run this in a range of different places. 
Being able to leverage different backends was important:</p><ol><li><p>Local mode, so you could run it on your laptop (like a lot of people would do)</p></li><li><p>An Azure backend, using the Azure SDK, to support standard cloud operations</p></li></ol><p>The other one that I implemented inside the distributed crawler was support for a framework called Dapr (a microservices framework from the Linux Foundation). The cool thing about Dapr is that it has a number of different plugins for a whole range of different storage features and functionality. Rather than me having to implement a number of different backends beyond the core ones I&#8217;ve just mentioned, it allows people to store the data in different services, depending on what they want to use as their own backends. This gives users a very flexible deployment strategy when it comes to spinning this stuff up, especially in cloud environments and even more so inside a Kubernetes deployment.</p><p>The other interesting thing with Dapr is that you can use it not just for storage but also for messaging, to send the data around. So rather than persisting it straight to disk, you could, for example, send it to a message bus and on to a different service that processes the data in real time. And so the idea of real-time data processing and streaming for a distributed crawler became a reality inside the project, because we could send that data to a message bus and on to Databricks, where it would get picked up and processed almost instantly.</p><p>The final piece of this puzzle was the use of a unified output format that I alluded to earlier. This format closely follows an output specification originally used by Junkipedia, a standard specification that they use across the different social media sites in their service. 
It made sense for us to mimic it closely so that we could also ingest their data, should we need to, to bolster the platform.</p><p>The idea was that every single social media post has basically the same content; it&#8217;s just the metadata surrounding it that is slightly different. So supporting those different types, whilst also making it easy for the platform to output similar formats, would allow us to ingest that data very easily when it hit Databricks. And so regardless of which social media platform you use, you know that the process to get that data into a readable format will be a lot shorter than if you were using individual crawlers for each different service.</p><p>And so that was basically the distributed crawler. The cool thing about it was that I was allowed to open source it. You can find the distributed crawler on GitHub right <a href="https://github.com/researchaccelerator-hub/distributed-crawler">here</a>. That is the platform to date. It supports a number of different backends, as we&#8217;ve mentioned. It supports Telegram and YouTube, but we do have plans to add more to it over time. 
We also want to make it more extensible and more useful for a whole range of different social media sites, and probably for standard website crawling as well in the future.</p>]]></content:encoded></item><item><title><![CDATA[How you can utilise AI in business]]></title><description><![CDATA[There is a lot of choice in AI today, but how can you put it to use?]]></description><link>https://articles.concepttocloud.com/p/how-you-can-utilise-ai-in-business</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/how-you-can-utilise-ai-in-business</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Mon, 03 Feb 2025 10:02:11 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/156137223/3995284eb4ba6289ced08f37ab78191c.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this webinar we look at what&#8217;s out there, what businesses can leverage and how you can make use of the myriad of different AI options on the market today. </p><p>Gain that competitive edge, leverage what is available, but do so in the way that makes the most sense to your business.</p>]]></content:encoded></item><item><title><![CDATA[Reinventing the way research is delivered]]></title><description><![CDATA[About X Ray Spectroscopy, Open Source software and why researchers shouldn't just default to Excel.]]></description><link>https://articles.concepttocloud.com/p/reinventing-the-way-research-is-delivered</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/reinventing-the-way-research-is-delivered</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Fri, 31 Jan 2025 15:52:11 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/156176453/850fadd7931577ff2082d7164e0cf272.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>At NASA we believed in building in the open and doing something special. 
And so we built Pixlise which reinvented the way that Spectroscopy research was carried out by the Mars Rover science team. We take a look at how we did it and some of the design choices we made.</p>]]></content:encoded></item><item><title><![CDATA[Is this thing on?]]></title><description><![CDATA[Protobufs a quick overview - A recording from Tom Barber's live video]]></description><link>https://articles.concepttocloud.com/p/is-this-thing-on</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/is-this-thing-on</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Thu, 30 Jan 2025 14:53:27 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/156098349/64072027caa299de2de9e62678a6328b.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>Protobufs allow developers to create high performance, cross language data objects that can ease the transmission of objects and the evolution of those objects across a data platform. 
In this video I take a quick look at what they are, why you might use them and when you might steer clear.</p>]]></content:encoded></item><item><title><![CDATA[The Stakeholder Express: Managing Expectations Across Teams]]></title><description><![CDATA[I got stuck in the wrong place today because of the UK train system, so to that end, I did a LinkedIn Live about the similarities between Trains, Schedules, communication, and Product Delivery.]]></description><link>https://articles.concepttocloud.com/p/the-stakeholder-express-managing</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/the-stakeholder-express-managing</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Wed, 29 Jan 2025 13:40:10 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/156010281/3fea719aba291dc545c804809acce1a5.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>I got stuck in the wrong place today because of the UK train system, so to that end, I did a LinkedIn Live about the similarities between Trains, Schedules, communication, 
and Product Delivery.</p><p>Managing stakeholders in cloud architecture and application development is a lot like running a railway system: different passengers (stakeholders) have different destinations (goals), some demand speed (quick features), others want reliability (stability), and no one wants unexpected delays. Here&#8217;s how to align teams, manage expectations, and prevent scope creep without derailing a project.</p>]]></content:encoded></item><item><title><![CDATA[Communication is key!]]></title><description><![CDATA[How should non-technical users and developers communicate and collaborate across a business?]]></description><link>https://articles.concepttocloud.com/p/communication-is-key</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/communication-is-key</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Fri, 24 Jan 2025 15:27:55 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/155621191/429912e0a21fac80f7b5a4bff81c70e2.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>How should business product owners, end users, stakeholders and developers communicate, collaborate and ensure the correct technologies get used and built?</p><p>Here are some ideas:</p><ol><li><p>Aligning Business and Technical Goals</p></li><li><p>Understanding the Customer Perspective</p></li><li><p>Bridging the Gap Between Technical Jargon and Business Speak</p></li><li><p>Defining Success Metrics Together</p></li><li><p>Prioritisation Through Feedback Loops</p></li><li><p>The Role of Transparency in Building Trust</p></li><li><p>Fostering a Culture of Continuous Improvement</p></li><li><p>Mitigating Risk Early</p></li><li><p>Celebrating Wins Together</p></li><li><p>Communication Tools and Practices</p></li></ol><p>Find out more in the video!</p>]]></content:encoded></item><item><title><![CDATA[Why Data Modelling Still Matters in the Era of Data Lakes]]></title><description><![CDATA[Data Lakes still need a Data 
Model.]]></description><link>https://articles.concepttocloud.com/p/why-data-modelling-still-matters</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/why-data-modelling-still-matters</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Thu, 23 Jan 2025 12:21:16 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/155530759/b2282aed5577c20e3cac2053073b868f.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this short video I go through 10 reasons why Data Models still matter.</p><ol><li><p>Bringing Order to the Chaos</p></li><li><p>Performance Optimisation</p></li><li><p>Data Quality and Consistency</p></li><li><p>Interoperability with downstream systems</p></li><li><p>Data Lineage</p></li><li><p>Cross Team Collaboration</p></li><li><p>Facilitating Data Discovery</p></li><li><p>Future-proofing your architecture</p></li><li><p>Improving Query Accuracy</p></li><li><p>Hybrid Data Architectures</p></li></ol><p></p><p>Data models continue to be an essential part of what we do. 
What other reasons can you think of?</p>]]></content:encoded></item><item><title><![CDATA[Landing zones, control tower, cloud based guardrails]]></title><description><![CDATA[How to get started, maybe, in your cloud journey.]]></description><link>https://articles.concepttocloud.com/p/landing-zones-control-tower-cloud</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/landing-zones-control-tower-cloud</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Wed, 22 Jan 2025 09:59:49 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/155323308/e008b3fbde82096cda8771168fa1622b.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>I went live yesterday and looked at Landing zones, the Cloud Adoption Framework, the Control Tower, and the Well-Architected Framework and how they might help you get started in your Cloud adoption journey.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://articles.concepttocloud.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://articles.concepttocloud.com/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Cost Saving in the Cloud]]></title><description><![CDATA[Some tips and tricks for saving cash in Cloud based deployments]]></description><link>https://articles.concepttocloud.com/p/cost-saving-in-the-cloud</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/cost-saving-in-the-cloud</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Tue, 21 Jan 2025 11:43:16 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/155323745/365df49c47aae14dd9182a1c2a009ab1.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>We look at how to save money when deploying workloads into the cloud.</p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://articles.concepttocloud.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://articles.concepttocloud.com/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Designing Applications For Scale]]></title><description><![CDATA[Designing cloud applications that can scale effectively is crucial for any growing business.]]></description><link>https://articles.concepttocloud.com/p/designing-applications-for-scale</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/designing-applications-for-scale</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Thu, 07 Nov 2024 19:26:56 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Designing cloud applications that can scale effectively is crucial for any growing business. As digital transformation becomes an industry standard, an application's ability to handle increased demands seamlessly can determine its success or failure. Businesses today are no longer static entities; they evolve rapidly, often experiencing unpredictable growth in user bases, data volumes, and functionality requirements.</p><p>Scalability is the capacity of your system to adapt to changes&#8212;whether it's a sudden surge in traffic or the gradual addition of new services and features. Applications face performance bottlenecks without scalability, leading to slow response times, downtime, and poor user experiences. 
It's about being prepared for success and future-proofing your architecture to handle shifts in user behaviour and business goals. Who remembers websites getting slashdotted regularly not that long ago? Getting 'slashdotted' refers to the sudden surge in traffic that a website receives after being featured on the technology news site Slashdot, which often overwhelms servers and causes the site to crash or become unresponsive.</p><p>In this post, we'll explore the core principles of designing cloud applications for scale, diving into best practices, patterns, and architectural considerations that help ensure your system can grow alongside your business. We'll look at leveraging cloud-native tools and approaches to maintain performance, reliability, and efficiency, no matter how your user base or data needs expand.</p><h1><strong>The Fundamentals of Scalability</strong></h1><p>Scalability in cloud applications involves designing your system to handle increased loads efficiently without compromising performance. Scalability comes in two forms:</p><ol><li><p><strong>Vertical Scalability</strong> (Scaling Up): Adding more power (CPU, memory, etc.) to an existing server.</p></li><li><p><strong>Horizontal Scalability</strong> (Scaling Out): Adding more servers to distribute the load.</p></li></ol><p>Horizontal scaling is generally preferred in cloud environments due to its cost efficiency and resilience benefits. 
Cloud providers like AWS, Azure, and Google Cloud simplify horizontal scaling through managed services, autoscaling features, and distributed architectures.</p><p>Vertical scalability, also known as scaling up, refers to adding more power (CPU, memory, etc.) to an existing server to improve its capacity. This approach can be simpler to implement since it doesn't require application architecture changes. However, it has inherent limitations because a single server can only be scaled up to a certain point before hardware limits are reached. Vertical scalability is often practical when an application has a single point of workload that cannot be easily distributed.</p><p>When deciding between vertical and horizontal scalability, it's essential to consider the nature of your application and growth patterns. Vertical scalability is suitable for smaller, less distributed systems where a quick boost in resources is needed. On the other hand, horizontal scalability is generally preferred for cloud-native applications that need to handle unpredictable and significant growth, as it allows for distributing the workload across multiple instances and reduces the risk of hitting resource limits.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 424w, 
https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080" width="7680" height="4320" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4320,&quot;width&quot;:7680,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;background pattern&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="background pattern" title="background pattern" 
srcset="https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1667372459534-848ec00d4da7?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxtaWNyb3NlcnZpY2V8ZW58MHx8fHwxNzMxMDA3NTAwfDA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 
0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="true">Growtika</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><h1><strong>Key Design Principles</strong></h1><h2><strong>Microservices Architecture</strong></h2><p>Designing applications with a microservices architecture enables different components to scale independently. Unlike monolithic architectures, where all features are tightly coupled, microservices break down functionality into smaller, isolated services that communicate via APIs. This allows you to scale specific parts of the application based on demand, optimizing resource usage.</p><p>Containers, orchestrated by a platform such as Kubernetes, provide an ideal way to deploy microservices. Containers encapsulate each microservice with its dependencies, ensuring consistency across development, testing, and production environments. Kubernetes then orchestrates these containers, automating the deployment, scaling, and management of microservices, which makes it easier to maintain the independence of each service while ensuring the overall system scales smoothly as demand changes.</p><p>While microservices sound appealing due to their scalability and flexibility, they can also add significant complexity to your platform. Managing a distributed system with multiple microservices requires robust orchestration, monitoring, and communication strategies. Issues such as inter-service latency, network failures, and distributed data management can introduce challenges that require careful consideration. 
The increased complexity means that the benefits of microservices are often best realized when an application has reached a particular scale or level of complexity that justifies the additional overhead.</p><ul><li><p><strong>Example</strong>: In an e-commerce platform, the checkout service can be scaled separately from the product listing service, ensuring peak traffic during sales events doesn't impact overall application performance.</p></li></ul><h2><strong>Statelessness</strong></h2><p>A <strong>stateless</strong> architecture simplifies scalability. Stateless components do not retain data between requests, making it easier to distribute requests across multiple instances. Leveraging stateless designs means that each server can handle any request, simplifying load balancing and reducing the chance of bottlenecks.</p><ul><li><p><strong>Implementation Tip</strong>: Store session information in databases or distributed caching systems like Redis rather than in memory on individual servers.</p></li></ul><h2><strong>Database Scalability</strong></h2><p>Scaling databases can be challenging, but several strategies can help:</p><ul><li><p><strong>Sharding</strong>: Splitting your database into smaller, faster, and more manageable parts.</p></li><li><p><strong>Read Replicas</strong>: Read replicas offload read operations from the primary database, which is particularly useful for read-heavy applications.</p></li><li><p><strong>NoSQL Databases</strong>: Consider using NoSQL databases for large volumes of unstructured data. 
They offer easier horizontal scaling than traditional SQL databases.</p></li></ul><p>Cloud platforms provide managed database solutions such as Amazon RDS, Azure Cosmos DB, and Google Firestore, which have built-in scaling features to help handle large amounts of data and traffic.</p><h2><strong>Load Balancing</strong></h2><p><strong>Load balancing</strong> is essential for distributing incoming traffic across multiple servers to ensure no single server becomes a bottleneck. Managed load balancers, like AWS Elastic Load Balancing (ELB) or Azure Load Balancer, are critical components for high availability and reliability.</p><p>For global applications, load balancing can also include geographical distribution, routing users to the nearest data centre to reduce latency and improve user experience.</p><h2><strong>Caching Strategies</strong></h2><p>Implementing effective caching can drastically improve application performance and reduce the load on your servers and databases. Use CDNs (Content Delivery Networks), such as Cloudflare or Amazon CloudFront, as a caching layer to keep static content close to your users. For dynamic content, in-memory caches like Redis or Memcached help store frequently accessed data and reduce repeated database queries.</p><h2><strong>Autoscaling</strong></h2><p>One of the most significant advantages of cloud computing is the ability to automatically scale resources up or down based on current demand. Autoscaling ensures your application can handle traffic spikes while minimizing costs during low-traffic periods.</p><ul><li><p><strong>Example</strong>: Use AWS Auto Scaling or Azure Virtual Machine Scale Sets to dynamically adjust the number of instances running your application.</p></li></ul><h2><strong>Event-Driven Architectures</strong></h2><p>In event-driven architectures, services communicate via events rather than direct calls, which decouples components and improves scalability. 
Message brokers like AWS SNS/SQS, Azure Event Grid, or Apache Kafka are commonly used to pass information between services asynchronously, which helps your application scale under varying loads.</p><h1><strong>Choosing the Right Cloud Services</strong></h1><p>When designing for scale, managed services can offload much of the complexity of maintaining and scaling infrastructure.</p><ul><li><p><strong>Managed Databases</strong>: Reduce the operational burden by using managed database services that handle backups, scaling, and updates.</p></li><li><p><strong>Serverless Computing</strong>: Services like AWS Lambda or Azure Functions let you execute code without managing servers. 
Serverless computing automatically scales in response to incoming requests, ideal for applications with unpredictable workloads.</p></li><li><p><strong>Containerization</strong>: Using containers with Kubernetes or Amazon ECS allows for easy horizontal scaling and management of microservices, promoting efficient resource usage.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img 
src="https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080" width="3765" height="2824" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2824,&quot;width&quot;:3765,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;a bunch of wires and wires in a room&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="a bunch of wires and wires in a room" title="a bunch of wires and wires in a room" srcset="https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1272w, 
https://images.unsplash.com/photo-1687300172792-68a13c4e149a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxzZXJ2ZXIlMjBmYWlsdXJlfGVufDB8fHx8MTczMTAwNzUzMnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Photo by <a href="true">Ivan N</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><h1><strong>Designing for Failure</strong></h1><p>A core part of designing scalable cloud applications is accounting for failure. Cloud environments are inherently distributed, and failures are inevitable. 
Here are some practices to build resilience:</p><h2><strong>Redundancy</strong></h2><p>Introduce redundancy at every level (multiple servers, load balancers, database replicas, and so on) to ensure there are no single points of failure. Multi-AZ (Availability Zone) deployments are crucial for high availability.</p><h2><strong>Graceful Degradation</strong></h2><p>Design your application so that if a part fails, it degrades gracefully rather than crashing entirely. For example, if the recommendation service in your application goes down, the core features should still function.</p><h2><strong>Monitoring and Alerts</strong></h2><p>Use tools like AWS CloudWatch, Azure Monitor, or Prometheus to monitor application performance and set up alerts for unusual behaviour. This proactive approach helps maintain system health and surfaces scaling issues before they affect users.</p><h1><strong>Cost Optimization Strategies</strong></h1><p>Scaling can lead to increased costs, so it's vital to implement cost optimization strategies.</p><ul><li><p><strong>Right-Sizing Resources</strong>: Continuously analyze the usage of your resources and adjust them to the appropriate size.</p></li><li><p><strong>Spot Instances</strong>: Use spot instances for non-critical, interruption-tolerant workloads to significantly reduce costs.</p></li><li><p><strong>Serverless and Autoscaling</strong>: Pay only for what you use with serverless and autoscaling to avoid over-provisioning.</p></li></ul><h1><strong>Real-World Example: Scaling an E-Commerce Application</strong></h1><p>Consider an e-commerce application experiencing fluctuating traffic, with significant spikes during holidays or sales. 
Here's how you might design it for scale:</p><ol><li><p><strong>Microservices</strong>: Separate the checkout, product catalogue, user management, and payment processing into distinct services.</p></li><li><p><strong>Load Balancing and Autoscaling</strong>: Use load balancers to distribute user requests and set up autoscaling for the checkout and product catalogue services to manage spikes.</p></li><li><p><strong>Database Strategy</strong>: Use read replicas for the product catalogue and shard the user data to improve performance.</p></li><li><p><strong>Caching</strong>: Cache product images and descriptions in a CDN and use Redis for session storage to reduce database load.</p></li><li><p><strong>Event-Driven</strong>: Implement an event-driven order processing system using message queues to handle payment and inventory updates automatically.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 848w, 
https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080" width="8000" height="6000" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:6000,&quot;width&quot;:8000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;a tablet and a laptop&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="a tablet and a laptop" title="a tablet and a laptop" srcset="https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 424w, 
https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1658297063569-162817482fb6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1M3x8ZWNvbW1lcmNlfGVufDB8fHx8MTczMTAwNzU4Nnww&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=1080 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Photo by <a href="true">hookle.app</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><h1><strong>Challenges in Scaling Cloud Applications</strong></h1><p>Scaling cloud applications comes with its own set of challenges, including:</p><ul><li><p><strong>Data Consistency</strong>: Maintaining data consistency across distributed components can be difficult. Strategies like eventual consistency and distributed transactions can help.</p></li><li><p><strong>Latency</strong>: As you scale globally, latency can become an issue. Techniques such as edge computing and data locality can help reduce the effects of latency.</p></li><li><p><strong>Complexity</strong>: Microservices, autoscaling, and distributed databases complicate development and operations. Proper tooling and monitoring are essential to manage this complexity effectively.</p></li></ul><h1><strong>Conclusion</strong></h1><p>Designing cloud applications for scale requires careful planning, leveraging cloud-native architectures, and making the most of managed services to handle infrastructure complexity. By breaking down monolithic applications into microservices, implementing stateless components, optimizing databases, and effectively using autoscaling, you can create an architecture that grows with your business needs.</p><p>Remember that scalability is not just about adding resources but optimizing your architecture to handle growth efficiently. 
By considering resilience, performance, and cost optimization, you can ensure your cloud application scales seamlessly while delivering a consistent user experience.</p>]]></content:encoded></item><item><title><![CDATA[Ethics in IT: Data and Model Training]]></title><description><![CDATA[In this video, we explore the critical role of ethics in data and model training for AI.]]></description><link>https://articles.concepttocloud.com/p/ethics-in-it-data-and-model-training</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/ethics-in-it-data-and-model-training</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Mon, 04 Nov 2024 13:25:34 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/151152694/077a0bb2435f2e40800a229a58c0071d.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this video, we explore the critical role of ethics in data and model training for AI. </p><p>From privacy and transparency to fairness and accountability, we dive deep into how tech professionals can create responsible AI that serves society ethically. 
Whether you&#8217;re a developer, data scientist, or just curious about AI&#8217;s impact on our world, this video is for you!</p>]]></content:encoded></item><item><title><![CDATA[Navigating the Ethical Landscape in AI and Data-Driven Engineering]]></title><description><![CDATA[Balancing Innovation with Responsibility in Today&#8217;s AI-Driven World]]></description><link>https://articles.concepttocloud.com/p/navigating-the-ethical-landscape</link><guid isPermaLink="false">https://articles.concepttocloud.com/p/navigating-the-ethical-landscape</guid><dc:creator><![CDATA[Tom Barber]]></dc:creator><pubDate>Fri, 01 Nov 2024 15:01:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KUzs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We all see AI and data-driven technologies used together more and more in daily life. Hell, it's been my job for the past 15 years, and boy, has it changed a lot in that time. What is changing now, though, is the need for tech leaders to think far more carefully about how they innovate while maintaining ethical responsibility. 
The GDPR doesn&#8217;t seem that long ago, but it also feels like a distant memory in terms of data processing legislation, and so much has changed since then. Navigating this landscape requires a thoughtful approach to transparency, fairness and accountability.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KUzs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KUzs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KUzs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KUzs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KUzs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KUzs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg" width="400" height="600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52159,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KUzs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KUzs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KUzs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KUzs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b0ebe3-762e-41d3-8589-3487e29a2f23_400x600.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Ibot the robot (https://www.deviantart.com/fabinus08/art/Ibot-the-robot-1057203715)</figcaption></figure></div><p>One of the biggest challenges, which isn&#8217;t new, is bias. Algorithms trained on historical data can inadvertently pick up and reinforce societal prejudices, amplifying disparities in critical areas like hiring (Amazon!), healthcare and criminal justice. To help avoid this, teams need to work out how to incorporate diverse data sets and establish more rigorous checks during model training and testing. 
Building an interdisciplinary approach that invites insights from the social sciences, ethics experts and user feedback can help uncover and address biases early.</p><p>Data privacy, as touched upon with GDPR, is also still very important. With growing attention from both regulators and the public, it is essential to prioritize user consent and transparency about how data is used. Those of us at the forefront of this technological advancement should argue for clear data handling policies that protect user rights without degrading model performance. Building this level of trust requires going beyond compliance and working to foster a culture that values privacy at its core.</p><p>Finally, in this short article, accountability matters. Who is responsible when AI systems err, or when data misuse or misrepresentation results in harm? A clear governance framework that details responsibilities and escalation procedures is essential. 
Audits, regular and in-depth ethical reviews, and transparent reporting processes can help ensure that teams remain accountable, but only when implemented properly and executed effectively.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-RNp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-RNp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-RNp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-RNp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-RNp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-RNp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg" width="1024" height="608" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:608,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-RNp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-RNp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-RNp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-RNp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41a2cbb-6eb1-41e3-9a6e-e3293622396c_1024x608.jpeg 1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption">A futuristic robot in a cityscape</figcaption></figure></div><p>Ultimately, ethical AI and data-driven engineering are about proactive choices. By embedding these principles into every stage of our work, we can harness technology's transformative power while respecting and protecting the values that bind us as a society. </p>]]></content:encoded></item></channel></rss>