<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Software Engineering Blog | Clear Thinking & Practical Insights — Igna]]></title><description><![CDATA[Software engineering blog focused on real-world problem-solving, clear thinking, and practical insights on building better systems.]]></description><link>https://blog.ignam.com</link><generator>RSS for Node</generator><lastBuildDate>Fri, 05 Jun 2026 20:10:17 GMT</lastBuildDate><atom:link href="https://blog.ignam.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Step-by-Step Guide to Configuring a Vite + React + TypeScript Component Library]]></title><description><![CDATA[I want to thank the community for all the love and support shown to my component library template using Vite and React. I appreciate the feedback and recommendations you've shared, and I also thank future developers who use the library!
The purpose o...]]></description><link>https://blog.ignam.com/step-by-step-guide-to-configuring-a-vite-react-typescript-component-library</link><guid isPermaLink="true">https://blog.ignam.com/step-by-step-guide-to-configuring-a-vite-react-typescript-component-library</guid><category><![CDATA[vite]]></category><category><![CDATA[React]]></category><category><![CDATA[TypeScript]]></category><category><![CDATA[Storybook]]></category><category><![CDATA[template]]></category><dc:creator><![CDATA[Ignacio Miranda Figueroa]]></dc:creator><pubDate>Sun, 10 Sep 2023 16:45:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1750606482100/386179e8-3b22-470a-aa66-5ae2d6c4d357.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I want to thank the community for all the love and support shown to my <a target="_blank" href="https://github.com/IgnacioNMiranda/vite-component-library-template">component library template</a> using Vite and React. I appreciate the feedback and recommendations you've shared, and I also thank future developers who use the library!</p>
<p>The purpose of this post is to explain how to set up the component library template. You can find all the library's features in the repo's <a target="_blank" href="https://github.com/IgnacioNMiranda/vite-component-library-template/blob/main/README.md">README file</a>, but I'll highlight the most important ones here.</p>
<ol>
<li><p><a target="_blank" href="https://vitejs.dev">Vite</a>: Run and build the project blazingly fast!</p>
</li>
<li><p><a target="_blank" href="https://tailwindcss.com">TailwindCSS 4</a>: Utility classes to define your styling</p>
</li>
<li><p><a target="_blank" href="https://storybook.js.org/">Storybook 9</a>: Components Preview</p>
</li>
<li><p><a target="_blank" href="https://github.com/googleapis/release-please">Release Please</a>: CHANGELOG.md and GitHub tags generation</p>
</li>
<li><p>Version release configuration for both the GitHub package registry and NPM registry.</p>
</li>
</ol>
<p>Without further ado, let's dive into it 🚀</p>
<h1 id="heading-repo-setup">Repo setup</h1>
<p>You can directly create a new repository by clicking the <code>Use this template &gt; Create a new repository</code> button:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1694357727078/6aa00c98-3b88-42ed-9cd5-0bf4075b4923.png" alt class="image--center mx-auto" /></p>
<p>Then you can clone the newly generated repo and install the dependencies using <code>pnpm install</code>. If you don't have pnpm installed, you can always run <code>corepack enable</code> to activate it (only works from Node 18+). You could also use another package manager such as npm or yarn, but I prefer pnpm because it's faster and more efficient (:</p>
<p>Now you're able to run all the scripts this repository comes with. For example, running <code>pnpm dev</code> will start the Storybook dev server with some example components. You can find the complete list of scripts in the README file or by simply taking a look at the package.json file.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1694358286068/6217322c-4a27-4ae3-9032-a573b1d19175.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-changelog-update-and-version-release">Changelog update and version release</h1>
<p>This repo uses <code>release-please</code>, a tool created by Google that, in its own words, "automates CHANGELOG generation, the creation of GitHub releases, and version bumps for your projects."</p>
<p>You can find the GitHub workflow that takes care of this in the <code>.github/workflows/release-please.yml</code> file (more specifically, the first step of the workflow, which uses <code>google-github-actions/release-please-action@v3</code>). For it to work, we first need to go to the repo's Settings tab and open the <code>Code and automation &gt; Actions &gt; General</code> section, then scroll down to <code>Workflow permissions</code> and tick the <code>Allow GitHub Actions to create and approve pull requests</code> checkbox.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1694358821443/c054dab8-8a61-4644-85d6-aba4a2940ff6.png" alt class="image--center mx-auto" /></p>
<p>This allows <code>release-please</code> to create a Pull Request against the <code>main</code> branch that bumps the version and updates the changelog file; once the PR is merged, the GitHub tag is released. The PR is created by a bot, so a repo administrator has to approve and merge it manually.</p>
<h1 id="heading-publishing-the-package">Publishing the package</h1>
<p>The repo is configured to use the NPM registry, which I will explain first. You can skip to the next section if you want to know how to do it using the GitHub package registry.</p>
<h2 id="heading-using-the-npm-package-registry">Using the NPM package registry</h2>
<p>We simply need to get an NPM access token for our GitHub workflow to be able to publish the package to the registry and add it as a repository secret.</p>
<p>Log into <a target="_blank" href="https://www.npmjs.com/">npm</a> and go to the <code>Access Token</code> tab to create a new token. I'll use a classic one for demo purposes.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1694360036253/3548b576-fdde-470d-aa42-06571d3e2bba.png" alt class="image--center mx-auto" /></p>
<p>You can use either the <code>Publish</code> or <code>Automation</code> type as we need the token to be able to publish new versions of our package.</p>
<p>Copy the value of your token, then open the repo's Settings tab and go to <code>Security &gt; Secrets and variables &gt; Actions</code>. Add a new repository secret called <strong>NPM_TOKEN</strong> and paste the value of your token.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1694360256268/2901b2eb-4695-48cc-92f6-2c37f1fb20e7.png" alt class="image--center mx-auto" /></p>
<p>And we're done! With this, the <code>release-please.yml</code> workflow will be able to use this token (in line 55) to publish the package to the npm registry 🎉</p>
<h2 id="heading-using-the-github-package-registry">Using the GitHub package registry</h2>
<p>The configuration is pretty straightforward as we don't even need to get a new access token.</p>
<p>The repo has a workflow example in the <code>.github/examples/github-release-please.yml</code> file. It's pretty much the same as the original workflow, just including the following changes:</p>
<ol>
<li><p>Adding the <code>packages: write</code> permission in line 8. This will be enough for the autogenerated <code>GITHUB_TOKEN</code> to have permission to publish the package to the GitHub registry during the <strong>Publish</strong> step.</p>
</li>
<li><p>The registry URL is now <code>https://npm.pkg.github.com</code> (line 40)</p>
</li>
<li><p>We now use the existing <code>secrets.GITHUB_TOKEN</code> as the access token in line 57 instead of having to create another one.</p>
</li>
</ol>
<p>Simply replacing the existing <code>release-please.yml</code> file with the content of the example file is enough for the workflow part.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">You can also create a <a target="_blank" href="https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens">personal access token</a> with more granular permissions instead of declaring them under the <code>permissions</code> key in the YAML file, add it as a new repo secret, and use it in line 56. What was explained above is just the simplest way to do it.</div>
</div>

<p>Last but not least, go to the <code>package.json</code> and make sure the "name" key uses the organization scope where the package will be published. For instance:</p>
<ul>
<li><p>If I want to link the published package to the same <strong>vite-component-library-template</strong> repo, I'll have to change the name to "@ignacionmiranda/vite-component-library-template".</p>
</li>
<li><p>If your company is called Octocat, then the name would be "@octocat/your-library".</p>
</li>
</ul>
<p>Finally, don't forget to update the <strong>repository.url</strong> value with the URL of your actual repo.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1694610463196/298ae531-45a3-4847-b110-597184378016.png" alt class="image--center mx-auto" /></p>
<p>The above is the simplest explanation I came up with to set this up. If you want to have deeper insight, I encourage you to check the official GitHub docs about <a target="_blank" href="https://docs.github.com/en/actions/publishing-packages/publishing-nodejs-packages">publishing Node.js packages.</a></p>
<h1 id="heading-installing-the-library-as-a-dependency">Installing the library as a dependency</h1>
<h2 id="heading-using-an-npm-package">Using an NPM package</h2>
<p>If your package is public, great! You can simply go to your frontend application, run <code>pnpm i &lt;your-library&gt;</code>, and start using it.</p>
<p>If your package is private, you'll need to be invited to the npm organization that publishes the package and authenticate against the npm registry with your token, either by logging in with the npm CLI or through a <code>.npmrc</code> file. <a target="_blank" href="https://docs.npmjs.com/about-private-packages">Here</a> you can find some official docs about private NPM packages.</p>
<h2 id="heading-using-a-github-package">Using a GitHub package</h2>
<p>We need some additional steps if we use this approach, but nothing crazy (:</p>
<p>Inside your frontend app, you'll need to create a <code>.npmrc</code> file in the root of the project with the following content:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># The first section is your user name or organization name in</span>
<span class="hljs-comment"># lowercase.</span>

<span class="hljs-comment"># If my username is IgnacioNMiranda, then the first line should be:</span>
<span class="hljs-comment"># @ignacionmiranda:registry=https://npm.pkg.github.com</span>
@&lt;your-org-or-user&gt;:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=<span class="hljs-variable">${GITHUB_TOKEN}</span>
</code></pre>
<p>and then add the <strong>GITHUB_TOKEN</strong> variable to your <code>.env</code> file:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">export</span> GITHUB_TOKEN=&lt;your-token&gt;
</code></pre>
<p>The token has to be a <a target="_blank" href="https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens">personal access token</a>, created with at least the <code>read:packages</code> permission in order to download packages from the GitHub Package Registry.</p>
<p>After running <code>source .env</code> or using your preferred tool to load env vars from a <code>.env</code> file into the current console instance, you should be able to install your GitHub package. For example, if the package is called <code>@ignacionmiranda/vite-component-library</code> the command we have to run is: <code>pnpm i @ignacionmiranda/vite-component-library</code>.</p>
<h1 id="heading-using-the-library">Using the library</h1>
<p>Here are some examples of how to use the styles of the library and a React component in a Next.js application.</p>
<pre><code class="lang-javascript"><span class="hljs-comment">/* _app.tsx for pages router, or layout.tsx for App router  */</span>
<span class="hljs-keyword">import</span> <span class="hljs-string">'&lt;your-library&gt;/styles.css'</span>
<span class="hljs-comment">// More imports and your App component ...</span>
</code></pre>
<pre><code class="lang-javascript"><span class="hljs-comment">/* page.tsx */</span>
<span class="hljs-keyword">import</span> { AtButton } <span class="hljs-keyword">from</span> <span class="hljs-string">'&lt;your-library&gt;'</span>
<span class="hljs-comment">// More imports and your Page component...</span>
</code></pre>
<h1 id="heading-extra-testing-the-library-in-a-frontend-app-locally">Extra: Testing the library in a frontend app locally</h1>
<p>Sometimes we want to test the components we're building without having to publish canary, alpha, beta, or any other pre-release versions to the registry. To do that, we can follow these steps:</p>
<ol>
<li><p>Run <code>pnpm build:lib</code> to build the component library and get the output in the <strong>dist</strong> folder.</p>
</li>
<li><p>Run <code>pnpm pack</code> to create a .tgz file. This has the same content as the <strong>dist</strong> folder and will allow us to install the library in our frontend app locally.<br /> 🚨 Right now, the <strong>pack</strong> command deletes the 'dependencies' and 'devDependencies' keys from the package.json because of the prepack command. The pack command normally runs during the publish step in the GitHub workflow, so it isn't meant to be run locally, where it modifies your working copy of the package.json. Make sure you revert this change before pushing any new commit to your repo.</p>
</li>
<li><p>Go to your frontend app and add your library as a dependency in the package.json. Instead of setting the version, add the path to the .tgz file. For instance: <code>"vite-component-library-template": "../../vite-component-library-template/vite-component-library-template-2.0.4.tgz"</code> .</p>
</li>
<li><p>Install the deps in your frontend app. Now you should be able to see the local changes you made in the library and use them in your local development for the frontend app.</p>
</li>
</ol>
<h1 id="heading-wrapping-up">Wrapping Up</h1>
<p>Now we're able to set up a repository that contains a component library, follows semantic versioning, is published to a package registry, and can be used in a frontend application. I hope this is useful and provides an easy explanation of how to set up this kind of project (: It can get really hard to make everything work together, and the config code and files can quickly become a real mess. Feel free to create an issue in the repo if you have difficulties with something or if you just want to make recommendations to continue improving it! You can also ping me on LinkedIn if you need help with anything else (:</p>
<p>Happy coding!</p>
]]></content:encoded></item><item><title><![CDATA[⚛️⚡ Vite + React + Typescript Component Library Template]]></title><description><![CDATA[A few weeks ago I created a template library using technologies such as Vite, React, Typescript, Vitest, and Storybook. It also manages automatically version releases using GitHub Actions. Just want to share it here with the community:
GitHub Reposit...]]></description><link>https://blog.ignam.com/vite-react-typescript-component-library-template</link><guid isPermaLink="true">https://blog.ignam.com/vite-react-typescript-component-library-template</guid><category><![CDATA[vite]]></category><category><![CDATA[React]]></category><category><![CDATA[TypeScript]]></category><category><![CDATA[Storybook]]></category><dc:creator><![CDATA[Ignacio Miranda Figueroa]]></dc:creator><pubDate>Fri, 24 Feb 2023 16:01:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1677256624547/b161d797-4215-47ed-88ad-e23cee5bc4b6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few weeks ago I created a template library using technologies such as Vite, React, TypeScript, Vitest, and Storybook. It also automatically manages version releases using GitHub Actions. I just want to share it here with the community:</p>
<p>GitHub Repository: <a target="_blank" href="https://github.com/IgnacioNMiranda/vite-component-library-template">https://github.com/IgnacioNMiranda/vite-component-library-template</a></p>
<p>Storybook Preview: <a target="_blank" href="https://vite-component-library-template.vercel.app/">https://vite-component-library-template.vercel.app/</a></p>
<p>I hope it's useful for anyone who wants to start a personal library project or use it in a project for their company 😉 It would also be nice if you supported it by giving it a star or mentioning it in the repo you create 😄</p>
<p>Happy coding! :)</p>
]]></content:encoded></item><item><title><![CDATA[Passing params from an Apache Airflow DAG to triggered DAGs using TriggerDagRunOperator]]></title><description><![CDATA[So I was in this situation, struggling for like 5 hours yesterday (yes, the last 5 Friday work hours, the best ones to get stuck with some code) trying to pass parameters using the TriggerDagRunOperator, and wanting to die but at the end achieving it...]]></description><link>https://blog.ignam.com/passing-params-from-an-apache-airflow-dag-to-triggered-dags-using-triggerdagrunoperator</link><guid isPermaLink="true">https://blog.ignam.com/passing-params-from-an-apache-airflow-dag-to-triggered-dags-using-triggerdagrunoperator</guid><category><![CDATA[airflow]]></category><category><![CDATA[apache-airflow]]></category><category><![CDATA[Python]]></category><category><![CDATA[Python 3]]></category><dc:creator><![CDATA[Ignacio Miranda Figueroa]]></dc:creator><pubDate>Sat, 07 Jan 2023 20:54:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1673124709607/4405a5d3-aede-42b6-9182-5f6b87d6e393.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>So I was in this situation, struggling for like 5 hours yesterday (yes, the last 5 Friday work hours, the best ones to get stuck with some code) trying to pass parameters using the <a target="_blank" href="https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/trigger_dagrun/index.html">TriggerDagRunOperator</a>, and wanting to die but at the end achieving it.</p>
<p>Maybe I just wasn't experienced enough and got stuck on something really easy to fix, but today I'll show how to do it so you don't have to struggle as I did 🙂 Let's get into it.</p>
<h3 id="heading-use-case">Use Case</h3>
<p>If you want to go straight to the solution you can skip this section.</p>
<p>I had 2 data sources: an ERP and one content environment (from now on I'll call it 'env') from a CMS (if you don't know what a CMS is, I explain a little bit about it in <a target="_blank" href="https://igna.hashnode.dev/robots-and-sitemap-pages-using-nextjs-and-a-headless-cms">this post</a>). I had 2 DAGs that ran at the same time (with the same <strong>schedule_interval</strong>) and synced data from the ERP to the CMS. Each DAG synced a specific type of data to the same env.</p>
<p>Until now, both DAGs ran individually, updating the CMS environment asynchronously. The sync process between the 2 data sources is not free of failures, so a new need came up: first create a backup of the env and then sync the data to a new env that is a copy of the old one. If anything goes wrong, we can just switch the environment and delete the broken one.</p>
<p>With this, the 2 DAGs can no longer run independently; they have to sync the data to the same environment. The proposed solution was to create a new DAG (which I'll call <strong>Wrapper</strong> from now on) that first runs this create-backup-env task and then triggers the 2 DAGs using the TriggerDagRunOperator. Also, these DAGs can no longer be executed manually or on a schedule; only the Wrapper DAG can trigger them. The create-backup-env task always has to run first so that the 2 DAGs always push data to the same env and never push to old envs that will not be used anymore.</p>
<p>Furthermore, the 2 DAGs can receive quite a few config parameters that enable or skip certain tasks, using the <strong>Trigger DAG w/config</strong> feature that Airflow provides, so these parameters also have to be available in the Wrapper DAG.</p>
<h3 id="heading-the-solution">The Solution</h3>
<p>FYI - I simplified the solution a lot but always kept the main components untouched.</p>
<p>To use the TriggerDagRunOperator, we need to define something like this:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Wrapper DAG</span>
<span class="hljs-keyword">from</span> airflow.decorators <span class="hljs-keyword">import</span> task, dag
<span class="hljs-keyword">from</span> airflow.operators.trigger_dagrun <span class="hljs-keyword">import</span> TriggerDagRunOperator
<span class="hljs-keyword">from</span> airflow.operators.python <span class="hljs-keyword">import</span> get_current_context
<span class="hljs-keyword">from</span> airflow.utils.state <span class="hljs-keyword">import</span> State
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">@dag(start_date=datetime(2023, 1, 7), schedule_interval='@daily', catchup=False)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper_dag</span>():</span>
<span class="hljs-meta">    @task.python</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_backup_env</span>():</span>
        print(<span class="hljs-string">'Creating backup env...'</span>)

    trigger_sync_dag_1_task = TriggerDagRunOperator(
        task_id=<span class="hljs-string">'trigger_sync_dag_1'</span>,
        trigger_dag_id=<span class="hljs-string">'sync_dag'</span>,
        wait_for_completion=<span class="hljs-literal">True</span>,
        poke_interval=<span class="hljs-number">60</span>,
        failed_states=[State.FAILED],
    )

    trigger_sync_dag_2_task = TriggerDagRunOperator(...)

<span class="hljs-meta">    @task.python</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">other_task</span>():</span>
        context = get_current_context()
        params = context[<span class="hljs-string">'params'</span>]  <span class="hljs-comment"># Access to context params</span>
        print(params[<span class="hljs-string">'message'</span>])

    create_backup_env() &gt;&gt; [trigger_sync_dag_1_task, trigger_sync_dag_2_task] &gt;&gt; other_task()


wrapper_dag()
</code></pre>
<pre><code class="lang-python"><span class="hljs-comment"># Sync DAG (let's assume we have 2 like this that are pretty similar)</span>
<span class="hljs-keyword">from</span> airflow.decorators <span class="hljs-keyword">import</span> task, dag
<span class="hljs-keyword">from</span> airflow.operators.python <span class="hljs-keyword">import</span> get_current_context
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">import</span> logging

<span class="hljs-meta">@dag(start_date=datetime(2023, 1, 7), schedule_interval='@daily', catchup=False)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sync_dag</span>():</span>
<span class="hljs-meta">    @task.python</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sync</span>():</span>
        logging.info(<span class="hljs-string">'Syncing data...'</span>)
        <span class="hljs-comment"># Access to context params in order to perform certain tasks</span>
        context = get_current_context()
        params = context[<span class="hljs-string">'params'</span>]
        logging.debug(<span class="hljs-string">f'params: <span class="hljs-subst">{params}</span>'</span>)
        <span class="hljs-keyword">if</span> <span class="hljs-string">'run-task-a'</span> <span class="hljs-keyword">in</span> params <span class="hljs-keyword">and</span> params[<span class="hljs-string">'run-task-a'</span>]:
            logging.info(<span class="hljs-string">'Running task A...'</span>)
        <span class="hljs-keyword">elif</span> <span class="hljs-string">'run-task-b'</span> <span class="hljs-keyword">in</span> params <span class="hljs-keyword">and</span> params[<span class="hljs-string">'run-task-b'</span>]:
            logging.info(<span class="hljs-string">'Running task B...'</span>)
    sync()

sync_dag()
</code></pre>
<p>To access the params object passed to a DAG using the <strong>Trigger DAG w/config</strong> Airflow feature, we can use the <strong>params</strong> key inside the context that we retrieve using the <strong>get_current_context</strong> function. This returns the active DAG run context. We can also use the Jinja template interpolation feature that Airflow provides out of the box, which means using a string like <code>{{ params }}</code> in certain templated operator fields or properties. (For a deeper insight check the <a target="_blank" href="https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html">official documentation</a>).</p>
<p>The <strong>TriggerDagRunOperator</strong> supports a field called <strong>conf</strong> that can receive a Python dictionary that will be used as the triggered DAG config. It also supports templating, which means we can do the following:</p>
<pre><code class="lang-python">trigger_dag_task = TriggerDagRunOperator(
    task_id=<span class="hljs-string">'trigger_dag'</span>,
    trigger_dag_id=<span class="hljs-string">'triggered_dag'</span>,
    conf=<span class="hljs-string">'{{ params }}'</span>,
    <span class="hljs-comment"># conf='{{ dag_run.conf }}' would pass the DAG run conf object instead</span>
    wait_for_completion=<span class="hljs-literal">True</span>,
    poke_interval=<span class="hljs-number">60</span>,
    failed_states=[State.FAILED],
)
</code></pre>
<p>As I mentioned, the <strong>conf</strong> parameter expects a Python dictionary. If we don't pass any config object to the Wrapper DAG, this still works, because the template interpolates the params object (which is None) and no error is raised. However, passing some parameters (for instance, <code>{"run-task-a": true}</code>) results in the following error in the TriggerDagRunOperator task instance:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673119337915/b780c5c9-b822-4d8f-8fb1-0f7c864d1258.png" alt="TriggerDagRunOperator task instance log" class="image--center mx-auto" /></p>
<p>So we have to rewrite our conf param:</p>
<pre><code class="lang-python">trigger_dag_task = TriggerDagRunOperator(
    task_id=<span class="hljs-string">'trigger_dag'</span>,
    trigger_dag_id=<span class="hljs-string">'triggered_dag'</span>,
    <span class="hljs-comment"># You can use whichever key you want. I used 'configuration'.</span>
    conf={<span class="hljs-string">'configuration'</span>: <span class="hljs-string">'{{ params }}'</span>},
    wait_for_completion=<span class="hljs-literal">True</span>,
    poke_interval=<span class="hljs-number">60</span>,
    failed_states=[State.FAILED],
)
</code></pre>
<p>Doing this, we have the following <code>context['params']</code> object available in our triggered DAGs: <code>{'configuration': "{'run-task-a': True}"}</code> .</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673121340927/79e80b74-26d2-4821-8463-a720f937ffa4.png" alt="Sync task log" class="image--center mx-auto" /></p>
<p>We have 2 problems here. As you can imagine, the 2 Sync DAGs were built to read <code>context['params']</code> instead of <code>context['params']['configuration']</code>. Furthermore, we're receiving a string representation of the Python dictionary instead of the dictionary itself.</p>
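<p>Both problems can be reproduced without Airflow. The sketch below is only an illustration: <code>wrapper_params</code> and <code>triggered_params</code> are made-up names, and <code>str()</code> stands in for the template rendering, assuming the rendered value matches the string shown in the log above.</p>

```python
# Hypothetical params passed to the Wrapper DAG
wrapper_params = {'run-task-a': True}

# Assumption: the templated field ends up holding the text form of the dict
rendered = str(wrapper_params)
triggered_params = {'configuration': rendered}

# Problem 1: the flag is no longer at the top level of the params object
print('run-task-a' in triggered_params)  # False

# Problem 2: the nested value is a string, not a dictionary
print(type(triggered_params['configuration']).__name__)  # str
```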
<p>To handle this, we'll need to modify our sync DAGs a little bit. We can create a <strong>get_context_params</strong> util function:</p>
<pre><code class="lang-python"><span class="hljs-comment"># dags/utils/common.py</span>
<span class="hljs-keyword">from</span> ast <span class="hljs-keyword">import</span> literal_eval
<span class="hljs-keyword">from</span> airflow.operators.python <span class="hljs-keyword">import</span> get_current_context


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_context_params</span>():</span>
    context = get_current_context()
    params = context[<span class="hljs-string">'params'</span>]
    <span class="hljs-keyword">if</span> <span class="hljs-string">'configuration'</span> <span class="hljs-keyword">in</span> params:
        params = {
            **params,
            **literal_eval(params[<span class="hljs-string">'configuration'</span>])
        }
        <span class="hljs-keyword">del</span> params[<span class="hljs-string">'configuration'</span>]
    <span class="hljs-keyword">return</span> params
</code></pre>
<p>Here we're checking whether the <code>params</code> object has a <code>configuration</code> property. If so, we parse its value into a Python dictionary using the <strong>literal_eval</strong> function from the <strong>ast</strong> module and spread the result into the top level of the <code>params</code> object. This function evaluates a string containing a Python literal, for instance, a Python dictionary. You can <a target="_blank" href="https://docs.python.org/3/library/ast.html#ast.literal_eval">click here</a> to visit the official docs and have a deeper insight into it.</p>
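<p>To try the parsing and merging step in isolation, here is a minimal sketch that leaves Airflow out entirely. <code>merge_configuration</code> and the sample <code>received</code> dictionary are hypothetical names used only for this illustration; the function follows the same idea as <strong>get_context_params</strong> but takes the params as an argument.</p>

```python
from ast import literal_eval


def merge_configuration(params):
    """Return a copy of params with the stringified 'configuration' dict spread into the top level."""
    merged = dict(params)
    raw = merged.pop('configuration', None)
    if raw is not None:
        # literal_eval only parses Python literals, so arbitrary expressions are rejected
        merged.update(literal_eval(raw))
    return merged


# Hypothetical params, shaped like what the triggered DAG receives
received = {'configuration': "{'run-task-a': True}"}
print(merge_configuration(received))  # {'run-task-a': True}
```

<p>Working on a copy keeps the original params untouched, which makes the helper easy to check in a plain Python shell before wiring it into a DAG.</p>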
<p>Ultimately, our Sync DAG has to be rewritten as follows:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Sync DAG (let's assume we have 2 like this that are pretty similar)</span>
<span class="hljs-keyword">from</span> airflow.decorators <span class="hljs-keyword">import</span> task, dag
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">import</span> logging

<span class="hljs-keyword">from</span> dags.utils.common <span class="hljs-keyword">import</span> get_context_params


<span class="hljs-meta">@dag(start_date=datetime(2023, 1, 7), schedule_interval='@daily', catchup=False)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sync_dag</span>():</span>
<span class="hljs-meta">    @task.python</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sync</span>():</span>
        logging.info(<span class="hljs-string">'Syncing data...'</span>)
        <span class="hljs-comment"># Access to context params in order to perform certain tasks</span>
        params = get_context_params()
        logging.debug(params)
        <span class="hljs-keyword">if</span> <span class="hljs-string">'run-task-a'</span> <span class="hljs-keyword">in</span> params <span class="hljs-keyword">and</span> params[<span class="hljs-string">'run-task-a'</span>]:
            logging.info(<span class="hljs-string">'Running task A...'</span>)
        <span class="hljs-keyword">elif</span> <span class="hljs-string">'run-task-b'</span> <span class="hljs-keyword">in</span> params <span class="hljs-keyword">and</span> params[<span class="hljs-string">'run-task-b'</span>]:
            logging.info(<span class="hljs-string">'Running task B...'</span>)
    sync()

sync_dag()
</code></pre>
<p>Now if we run the Wrapper DAG passing the following config object:</p>
<p><code>{"run-task-a": true}</code></p>
<p>We'll get the following result in the <strong>sync</strong> task logs:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673121462899/56ae94db-781f-45f1-954a-b42ccc68d5fc.png" alt class="image--center mx-auto" /></p>
<p>With this, we're able to pass params from a parent DAG to a triggered DAG without having to change much of the logic that uses the context params (:</p>
<p>If you think I overcomplicated the solution (it's probably the case) I encourage you to leave a comment ^^ then all of us can continue learning (:</p>
<p>You can check the source code <a target="_blank" href="https://github.com/IgnacioNMiranda/trigger-dag-run-params-demo">here</a>. It includes some extra stuff like using the <strong>BranchPythonOperator</strong> to skip the syncs depending on more config parameters.</p>
<p>Thanks for reading!</p>
]]></content:encoded></item><item><title><![CDATA[Robots.txt and sitemap pages using Next.js and a Headless CMS]]></title><description><![CDATA[Search Engine Optimization (SEO) is one of those frontend things that can always get tricky. You can have really good HTML practices, the fastest load times, meta tags or social media images. All of that is going to help a lot to increase the positio...]]></description><link>https://blog.ignam.com/robots-and-sitemap-pages-using-nextjs-and-a-headless-cms</link><guid isPermaLink="true">https://blog.ignam.com/robots-and-sitemap-pages-using-nextjs-and-a-headless-cms</guid><category><![CDATA[Next.js]]></category><category><![CDATA[Contentful]]></category><category><![CDATA[SEO]]></category><category><![CDATA[headless cms]]></category><category><![CDATA[js]]></category><dc:creator><![CDATA[Ignacio Miranda Figueroa]]></dc:creator><pubDate>Wed, 04 Jan 2023 03:29:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1672802897311/86dfd273-c64d-4d30-8a07-293bb085697b.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Search Engine Optimization (SEO) is one of those frontend things that can always get tricky. You can have really good HTML practices, the fastest load times, meta tags or social media images. All of that is going to help a lot to increase the positioning of your site. However, there are always 2 special pages that every site that wants to be well-indexed and crawled by search crawlers must have: the <strong>robots.txt</strong> and <strong>sitemap.xml</strong> pages.</p>
<p>In this post, we'll go through the details of what these pages are and how to build them in a Next.js project fetching data from a Headless Content Management System (CMS). But first of all, what is a CMS?</p>
<h3 id="heading-cms-definition">CMS Definition</h3>
<p>"A CMS, short for content management system, is a software application that allows users to build and manage a website without having to code it from scratch or know how to code at all. [...] With a CMS, you can create, manage, modify, and publish content in a user-friendly interface." (<a target="_blank" href="https://blog.hubspot.com/blog/tabid/6307/bid/7969/what-is-a-cms-and-why-should-you-care.aspx">source</a>)</p>
<p>Okay, we know what a CMS is, but what about a Headless CMS?</p>
<p>You can think of headless as "detached or decoupled from the website that serves the content, mainly consumed via API". To summarize, a Headless CMS is a Content Management System that is decoupled from the main application and serves its content via API. If you want to go deeper into the definition you can visit the <a target="_blank" href="https://www.storyblok.com/tp/headless-cms-explained">official explanation</a> from Storyblok, one of the biggest headless CMSs today.</p>
<p>There are many CMSs out there: Storyblok, Drupal, WordPress, Contentful, Strapi, Sanity, among others. Today I'll use Contentful because it's the Headless CMS I have used the most (: but the example should apply in much the same way to any of them.</p>
<p><strong><mark>DISCLAIMER</mark>: This is not a post about Contentful or the basics of any Headless CMS. If you're not familiar with these I encourage you to take a look into the</strong> <a target="_blank" href="https://jamstack.org/headless-cms/"><strong>most popular available options</strong></a> <strong>that best suit your needs.</strong></p>
<p>Let's start talking about the main topics of this post.</p>
<h3 id="heading-robotstxt">Robots.txt</h3>
<blockquote>
<p>"Robots.txt is a text file webmasters create to instruct web robots or crawlers (typically search engine robots) how to crawl pages on their website." (<a target="_blank" href="https://moz.com/learn/seo/robotstxt">source</a>)</p>
</blockquote>
<p>We achieve this by stating which robots can crawl our site and which pages they can crawl. The basic properties for this are the following:</p>
<ul>
<li><p><strong>User-agent</strong>: defines which robots the rules that follow apply to (<code>*</code> means all of them).</p>
</li>
<li><p><strong>Disallow rules</strong>: pages that cannot be crawled.</p>
</li>
<li><p><strong>Allow rules</strong>: pages that can be crawled (originally a Googlebot extension, now understood by most major crawlers).</p>
</li>
<li><p><strong>Crawl-delay</strong>: how many seconds a crawler should wait before loading and crawling page content (not every crawler honors it; Googlebot ignores it).</p>
</li>
<li><p><strong>Sitemap</strong>: Where the sitemap page is located.</p>
</li>
</ul>
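<p>To make these properties concrete, here is a minimal sketch that renders them into the text of a robots.txt file. The <code>RobotsGroup</code> type and the <code>renderRobotsTxt</code> helper are names I made up for illustration, they are not part of Next.js or any library.</p>

```typescript
// Hypothetical type for illustration: one group of rules for one crawler.
type RobotsGroup = {
  userAgent: string // which crawler the rules apply to, '*' for all of them
  disallow?: string[] // paths that must not be crawled
  allow?: string[] // paths that may be crawled
  crawlDelay?: number // seconds between requests (not every crawler honors it)
}

// Renders the groups, plus an optional sitemap location, as robots.txt text.
const renderRobotsTxt = (groups: RobotsGroup[], sitemapUrl?: string): string => {
  const lines: string[] = []
  for (const group of groups) {
    lines.push(`User-agent: ${group.userAgent}`)
    for (const path of group.disallow ?? []) lines.push(`Disallow: ${path}`)
    for (const path of group.allow ?? []) lines.push(`Allow: ${path}`)
    if (group.crawlDelay !== undefined) lines.push(`Crawl-delay: ${group.crawlDelay}`)
    lines.push('') // a blank line separates groups
  }
  if (sitemapUrl) lines.push(`Sitemap: ${sitemapUrl}`)
  return lines.join('\n')
}

const robots = renderRobotsTxt(
  [{ userAgent: '*', disallow: ['/404', '/500'], allow: ['/'] }],
  'https://my.site.com/sitemap.xml',
)
```

<p>Each group starts with its <code>User-agent</code> line, followed by its own rules, while the <code>Sitemap</code> line stands on its own because it is not tied to any particular robot.</p>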
<p>The Next.js implementation for this page is pretty straightforward and it does not need any CMS, but I didn't want to create a post just to paste this code fragment so here we are, putting everything together c:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// pages/robots.txt.tsx</span>
<span class="hljs-keyword">import</span> { Component } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>
<span class="hljs-keyword">import</span> { GetServerSidePropsContext } <span class="hljs-keyword">from</span> <span class="hljs-string">'next'</span>

<span class="hljs-keyword">const</span> isAllowed = process.env.NEXT_PUBLIC_ALLOW_CRAWLING <span class="hljs-comment">// 'true' or 'false'</span>
<span class="hljs-keyword">const</span> siteUrl = process.env.ORIGIN_URL <span class="hljs-comment">// https://my.site.com</span>

<span class="hljs-keyword">const</span> allow = <span class="hljs-string">`User-agent: *
Disallow: /500
Disallow: /404
Disallow: /403
Allow: /
Sitemap: <span class="hljs-subst">${siteUrl}</span>/sitemap.xml
`</span>

<span class="hljs-keyword">const</span> disallow = <span class="hljs-string">`User-agent: *
Disallow: /
`</span>

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-keyword">class</span> RobotTxt <span class="hljs-keyword">extends</span> Component {
  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> getInitialProps({ res }: GetServerSidePropsContext): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">const</span> robotFile = isAllowed === <span class="hljs-string">'true'</span> ? allow : disallow
    res.writeHead(<span class="hljs-number">200</span>, {
      <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'text/plain'</span>,
    })
    res.end(robotFile)
  }
}
</code></pre>
<p>If you're not using TypeScript you can just remove the types from the code. What we're doing is applying a config based on whether the site should be crawled or not, which is defined using an environment (from now on "env") variable. The code has been written with multiple envs in mind (you don't want your development or staging envs to be crawled). If your site only has one env you can ignore this and just use the "allow" configuration. The same principle applies to the <strong>ORIGIN_URL</strong> variable.</p>
<p>The <strong>disallow</strong> variable tells every search engine robot not to crawl any page of the site.</p>
<p>On the other hand, the <strong>allow</strong> variable lets every user agent (in other words, every robot) crawl the whole site except the 403, 404 and 500 pages. We exclude those mainly because they don't have relevant content (unless your error pages are flashy, funny and have interesting information).</p>
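<p>One detail worth guarding against is a missing env variable. The sketch below is only an assumption about how you might harden the selection: <code>isCrawlingAllowed</code> and <code>robotsFor</code> are hypothetical helpers, and the env values are passed in as arguments so the example doesn't depend on <code>process.env</code>.</p>

```typescript
// Hypothetical helper: only an explicit 'true' (any casing, surrounding
// whitespace ignored) enables crawling.
const isCrawlingAllowed = (flag: string | undefined): boolean =>
  (flag ?? '').trim().toLowerCase() === 'true'

// Hypothetical helper: picks the robots.txt content for an env.
const robotsFor = (flag: string | undefined, siteUrl: string | undefined): string => {
  // Anything other than 'true' blocks crawling, so a missing variable on a
  // staging env never opens the site to robots.
  if (!isCrawlingAllowed(flag)) return 'User-agent: *\nDisallow: /\n'

  const lines = ['User-agent: *', 'Disallow: /500', 'Disallow: /404', 'Disallow: /403', 'Allow: /']
  // Only advertise the sitemap when the origin is known, which avoids
  // printing "undefined/sitemap.xml". Trailing slashes are trimmed first.
  if (siteUrl) lines.push(`Sitemap: ${siteUrl.replace(/\/+$/, '')}/sitemap.xml`)
  return lines.join('\n') + '\n'
}

const production = robotsFor('true', 'https://my.site.com')
const staging = robotsFor(undefined, 'https://staging.my.site.com')
```

<p>Defaulting to the restrictive config is a design choice: forgetting to set the variable costs you some indexing time in production, which is easier to fix than a staging site that has already been crawled.</p>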
<h3 id="heading-sitemapxml">Sitemap.xml</h3>
<p>Now the real challenging section (kind of) (:</p>
<p>First of all, what is a sitemap? Based on my official patented, personal and not-stolen description:</p>
<blockquote>
<p>"Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL [...] so that search engines can more intelligently crawl the site." (<a target="_blank" href="https://www.sitemaps.org/">source</a>).</p>
</blockquote>
<p>So basically it defines the pages our site has and some additional metadata. The format for this page is the following:</p>
<pre><code class="lang-xml"><span class="hljs-meta">&lt;?xml version="1.0" encoding="UTF-8"?&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">urlset</span> <span class="hljs-attr">xmlns</span>=<span class="hljs-string">"http://www.sitemaps.org/schemas/sitemap/0.9"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">url</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">loc</span>&gt;</span>https://my.site.com<span class="hljs-tag">&lt;/<span class="hljs-name">loc</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">changefreq</span>&gt;</span>daily<span class="hljs-tag">&lt;/<span class="hljs-name">changefreq</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">lastmod</span>&gt;</span>2023-01-03<span class="hljs-tag">&lt;/<span class="hljs-name">lastmod</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">priority</span>&gt;</span>0.8<span class="hljs-tag">&lt;/<span class="hljs-name">priority</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">url</span>&gt;</span>
    ...
  <span class="hljs-tag">&lt;/<span class="hljs-name">urlset</span>&gt;</span>
</code></pre>
<p>For each page, we have to define a &lt;url&gt; block with some metadata (there are more properties but these are the most common):</p>
<ol>
<li><p>loc: page's URL.</p>
</li>
<li><p>changefreq: how frequently the page is likely to change (a hint for the crawler, not a command).</p>
</li>
<li><p>lastmod: date the page was last modified.</p>
</li>
<li><p>priority: the priority of the page relative to the other pages on the site, from 0.0 to 1.0.</p>
</li>
</ol>
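<p>If you're using TypeScript, these properties can be described with a small type so that invalid values are caught early. This is just a sketch: <code>SitemapEntry</code>, <code>formatPriority</code> and <code>toDateOnly</code> are names I'm assuming for illustration, while the list of <code>changefreq</code> values and the 0.0 to 1.0 priority range come from the sitemap protocol.</p>

```typescript
// Values the sitemap protocol accepts for <changefreq>.
type ChangeFreq = 'always' | 'hourly' | 'daily' | 'weekly' | 'monthly' | 'yearly' | 'never'

// Hypothetical shape for one <url> block; only loc is required.
type SitemapEntry = {
  loc: string
  changefreq?: ChangeFreq
  lastmod?: string // W3C datetime, e.g. '2023-01-03' or a full ISO timestamp
  priority?: number // 0.0 to 1.0
}

// Keeps the priority inside the valid range and prints it with one decimal.
const formatPriority = (priority: number): string =>
  Math.min(1, Math.max(0, priority)).toFixed(1)

// Trims a full ISO timestamp down to its date-only form (YYYY-MM-DD).
const toDateOnly = (isoTimestamp: string): string => isoTimestamp.slice(0, 10)

const home: SitemapEntry = {
  loc: 'https://my.site.com',
  changefreq: 'daily',
  lastmod: toDateOnly('2023-01-04T00:50:34.525Z'),
  priority: 0.8,
}
```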
<p>Now the question is, how can we generate this kind of page using data from our favorite Headless CMS?</p>
<p>When using this kind of CMS, pages normally live as entries with fields filled with content (titles, page slugs, banners, sections, headers, footers, etc). Then we fetch these pages and build the UI using some frontend library or framework like Next.js.</p>
<p>First things first, we need to define the Sitemap page component in our application:</p>
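<pre><code class="lang-typescript"><span class="hljs-comment">// pages/sitemap.xml.tsx</span>
<span class="hljs-keyword">import</span> { Component } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>
<span class="hljs-keyword">import</span> { GetServerSidePropsContext } <span class="hljs-keyword">from</span> <span class="hljs-string">'next'</span>

<span class="hljs-comment">// getPages and createSitemap are defined later in this post</span>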
<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-keyword">class</span> Sitemap <span class="hljs-keyword">extends</span> Component {
  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> getInitialProps({ res }: GetServerSidePropsContext): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">const</span> pages = <span class="hljs-keyword">await</span> getPages()
    res.writeHead(<span class="hljs-number">200</span>, { <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'text/xml'</span> })
    res.write(createSitemap(pages))
    res.end()
  }
}
</code></pre>
<p>Note that we're using 2 functions here:</p>
<ol>
<li><p>An async function called <strong>getPages</strong> that fetches some <em>page</em> data. It retrieves the pages from our CMS (in this case, Contentful) as an array.</p>
</li>
<li><p>A function called <strong>createSitemap</strong>. It receives the pages data as a parameter and returns the sitemap XML as a string.</p>
</li>
</ol>
<p>Let's dive into the second one first (also the easiest one). First of all, I'm gonna define a type for the Contentful pages (you can skip this if you're using vanilla JS):</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">type</span> ContentfulPage = {
  title: <span class="hljs-built_in">string</span>
  slug: <span class="hljs-built_in">string</span>
  header: ContentfulOrHeader
  blocks?: ContentfulBlock[]
  footer: ContentfulOrFooter
  updatedAt?: <span class="hljs-built_in">string</span>
}
</code></pre>
<p>This is basically how our page is built in Contentful. It has a title, a slug, a header component, some block components that make up the page itself (like banners, sections, cards, etc), a footer and the updatedAt date for the page.</p>
<p>Now the function itself:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> createSitemap = <span class="hljs-function">(<span class="hljs-params">pages: ContentfulPage[]</span>) =&gt;</span> {
  <span class="hljs-keyword">return</span> <span class="hljs-string">`&lt;?xml version="1.0" encoding="UTF-8"?&gt;
  &lt;urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"&gt;
    <span class="hljs-subst">${generateLinks(pages)}</span>
  &lt;/urlset&gt;`</span>
}
</code></pre>
<p>You can see that we're using another function called <strong>generateLinks</strong> that receives our pages data. Let's take a look at it:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> generateLinks = <span class="hljs-function">(<span class="hljs-params">pages: ContentfulPage[]</span>) =&gt;</span> {
  <span class="hljs-keyword">const</span> pageItems = pages.map(<span class="hljs-function">(<span class="hljs-params">page</span>) =&gt;</span> {
    <span class="hljs-keyword">const</span> slugPath = page.slug === <span class="hljs-string">'/'</span> ? <span class="hljs-string">''</span> : <span class="hljs-string">`/<span class="hljs-subst">${page.slug}</span>`</span>
    <span class="hljs-keyword">const</span> url = <span class="hljs-string">`<span class="hljs-subst">${process.env.ORIGIN_URL}</span><span class="hljs-subst">${slugPath}</span>`</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">`
        &lt;url&gt;
          &lt;loc&gt;<span class="hljs-subst">${url}</span>&lt;/loc&gt;
          &lt;changefreq&gt;daily&lt;/changefreq&gt;
          &lt;lastmod&gt;<span class="hljs-subst">${page.updatedAt}</span>&lt;/lastmod&gt;
          &lt;priority&gt;0.8&lt;/priority&gt;
        &lt;/url&gt;
      `</span>
  })
  <span class="hljs-keyword">return</span> pageItems.join(<span class="hljs-string">''</span>)
}
</code></pre>
<p>Here we're using the pages data to build a string with the required format for each item in our sitemap page. Ultimately, we return all the items joined in a single string, which is the one that finally gets inserted into our <strong>&lt;urlset&gt;</strong> tag.</p>
<p>Now that we have gone through the <strong>createSitemap</strong> function, let's continue with <strong>getPages</strong>. For the sake of simplicity, I'm using an already defined Contentful client and importing it from the services/contentful file. <a target="_blank" href="https://github.com/contentful/contentful.js">Click here</a> if you want to go deeper into the Contentful JS SDK and how to initialize a client to consume the data from the CMS.</p>
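<p>Two edge cases are worth a thought here: <code>updatedAt</code> is optional in our type, and a slug could contain characters that are special in XML (like <code>&</code>). The following is a hedged variation, not the version used in this post: <code>SitemapPage</code>, <code>escapeXml</code> and <code>generateSafeLinks</code> are hypothetical names, and the origin URL is passed as an argument instead of being read from the env.</p>

```typescript
// Reduced page shape for the sketch: only the fields the sitemap needs.
type SitemapPage = { slug: string; updatedAt?: string }

// Escapes the five characters that are special in XML. The ampersand goes
// first so the entities produced by the other replacements are left alone.
const escapeXml = (value: string): string =>
  value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')

const generateSafeLinks = (pages: SitemapPage[], originUrl: string): string =>
  pages
    .map((page) => {
      const slugPath = page.slug === '/' ? '' : `/${page.slug}`
      const loc = escapeXml(`${originUrl}${slugPath}`)
      // Leave <lastmod> out entirely when the CMS gave us no date,
      // instead of printing the string "undefined".
      const lastmod = page.updatedAt
        ? `\n  <lastmod>${escapeXml(page.updatedAt)}</lastmod>`
        : ''
      return `<url>\n  <loc>${loc}</loc>${lastmod}\n</url>`
    })
    .join('\n')

const links = generateSafeLinks(
  [{ slug: '/', updatedAt: '2023-01-04' }, { slug: 'search?q=a&b=c' }],
  'https://my.site.com',
)
```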
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { client } <span class="hljs-keyword">from</span> <span class="hljs-string">'services/contentful'</span>

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> getPages = <span class="hljs-keyword">async</span> (): <span class="hljs-built_in">Promise</span>&lt;ContentfulPage[]&gt; =&gt; {
  <span class="hljs-keyword">const</span> collection = <span class="hljs-keyword">await</span> client.getEntries({
    <span class="hljs-string">'content_type'</span>: <span class="hljs-string">'page'</span>,
  })
  <span class="hljs-keyword">const</span> pages = collection?.items &amp;&amp; collection.items?.length ? collection.items : <span class="hljs-literal">null</span>

  <span class="hljs-keyword">if</span> (pages) <span class="hljs-keyword">return</span> pages.map(<span class="hljs-function">(<span class="hljs-params">page</span>) =&gt;</span> ({
    title: page.fields.title,
    slug: page.fields.slug,
    header: page.fields.header,
    blocks: page.fields.blocks,
    footer: page.fields.footer,
    updatedAt: page.sys.updatedAt,
  }))
  <span class="hljs-keyword">return</span> []
}
</code></pre>
<p>What we're basically doing is using the client to fetch entries that have the 'page' type using the <strong>getEntries</strong> client method. If there are any, we store the page items in the <strong>pages</strong> variable. Then we map each page to the <strong>ContentfulPage</strong> type we defined previously to use them in the <strong>createSitemap</strong> function.</p>
<p>Contentful gives us the data in the following format:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"fields"</span>: {
     <span class="hljs-attr">"title"</span>: <span class="hljs-string">"string"</span>,
     <span class="hljs-attr">"slug"</span>: <span class="hljs-string">"string"</span>,
     <span class="hljs-attr">"header"</span>: { ... },
     ...
   },
  <span class="hljs-attr">"sys"</span>: {
    <span class="hljs-attr">"updatedAt"</span>: <span class="hljs-string">"2023-01-04T00:50:34.525Z"</span>,
    ...
  }
}
</code></pre>
<p>Here <strong>fields</strong> holds the entry fields themselves, like the title or slug, while the <strong>sys</strong> object holds metadata such as the updatedAt date.</p>
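<p>One thing to keep in mind with <strong>getPages</strong>: Contentful returns entries one page of results at a time (the response also carries <code>total</code>, <code>skip</code> and <code>limit</code>), so a site with many pages may need more than one request to list them all. Below is a minimal sketch of that loop. <code>fetchAllEntries</code> and <code>fetchPage</code> are hypothetical names, and the page size of 100 is just an assumed default for the example. With the real client, <code>fetchPage</code> could wrap a <code>getEntries</code> call that passes <code>skip</code> and <code>limit</code>.</p>

```typescript
// Shape of one page of results: the items plus the paging metadata.
type Collection<T> = { items: T[]; total: number; skip: number; limit: number }

// Hypothetical helper: keeps requesting pages until every entry has been read.
const fetchAllEntries = async <T>(
  fetchPage: (skip: number, limit: number) => Promise<Collection<T>>,
  limit = 100, // assumed page size for this sketch
): Promise<T[]> => {
  const all: T[] = []
  let total = Infinity
  while (all.length < total) {
    const page = await fetchPage(all.length, limit)
    total = page.total
    if (page.items.length === 0) break // defensive: never loop forever on an empty page
    all.push(...page.items)
  }
  return all
}

// In-memory stand-in for the CMS: 5 entries, served through the same shape.
const fakeEntries = ['home', 'about', 'blog', 'contact', 'faq']
const fakeFetch = async (skip: number, limit: number): Promise<Collection<string>> => ({
  items: fakeEntries.slice(skip, skip + limit),
  total: fakeEntries.length,
  skip,
  limit,
})
```

<p>Calling <code>fetchAllEntries(fakeFetch, 2)</code> makes three requests and resolves with the five entries in order.</p>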
<p>And that's it! With this, we have fetched pages data from our CMS and created a sitemap.xml page for our headless site. The approach is much the same for other Headless CMSs: the basic concept is that pages live as components in the CMS and we have to consume the data via API to build the page with our favorite language and technology. In this case, using JS and Next.js.</p>
<p>This was my first post so I hope all of you like it (: any feedback will be well received.</p>
<p>I'd also like to know your thoughts: do you think this is a good way to build these pages? Have you used a headless CMS before? What would you have done differently? (:</p>
<p>Last but not least, thanks for reading!</p>
]]></content:encoded></item></channel></rss>