<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>C on Greg Foletta - Bits and Blobs</title><link>https://clt.blog.foletta.net/categories/c/</link><description>Recent content in C on Greg Foletta - Bits and Blobs</description><generator>Hugo -- gohugo.io</generator><copyright>Copyright 2025 - Greg Foletta</copyright><lastBuildDate>Tue, 12 Nov 2024 00:00:00 +0000</lastBuildDate><atom:link href="https://clt.blog.foletta.net/categories/c/index.xml" rel="self" type="application/rss+xml"/><item><title>Simulating and Visualising the Central Limit Theorem</title><link>https://clt.blog.foletta.net/post/2025-07-14-clt/</link><pubDate>Wed, 13 Aug 2025 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2025-07-14-clt/</guid><description>&lt;p>I completed a Computer Science degree at uni, and bundled a lot of maths subjects in as electives: partial differential equations, vector calculus, discrete maths, linear algebra. For some reason, however, I always avoided statistics subjects. Maybe there’s a story to be told about a young person finding uncertainty uncomfortable, because twenty years later I find statistics, particularly the Bayesian flavour, really interesting.&lt;/p>
&lt;p>One problem with a self-directed journey is that there’s foundational knowledge that has come to me in dribs and drabs, and one of the most foundational is the &lt;em>Central Limit Theorem&lt;/em> (CLT). In this post I want to interrogate and explore the CLT using simulation and visualisation in an attempt to understand how it works in practice, not in theory. This is predominantly a process to help me better understand the CLT; you’re just here for the ride. Hopefully that ride can help you get where you need to go as well.&lt;/p>
&lt;p>It’s been a while since I’ve included any code in a post, so where it makes sense I’ll show the generating R code, with a liberal sprinkling of comments so it’s hopefully not too inscrutable.&lt;/p>
&lt;h1 id="a-brief-recap">A Brief Recap&lt;/h1>
&lt;p>I don’t want this to be like an online recipe with pages of back story before you get to the meat and bones, but a brief summary of the CLT before we begin is unavoidable. In plain English the CLT can be described as such:&lt;/p>
&lt;blockquote>
&lt;p>“If you take repeated samples of size &lt;em>n&lt;/em> from a distribution and calculate the sample mean for each, as &lt;em>n&lt;/em> approaches infinity, the distribution of sample means approaches a normal distribution.”&lt;/p>
&lt;/blockquote>
&lt;p>For the classic CLT there are a few assumptions about the sample and the source distribution:&lt;/p>
&lt;ul>
&lt;li>The sample is drawn independently (no autocorrelation like in a time series).&lt;/li>
&lt;li>All the data points are drawn from the same distribution (together with the first point, this is “independent and identically distributed”, or i.i.d.).&lt;/li>
&lt;li>The distribution has a finite mean and variance (e.g. no Cauchy distribution, and no Pareto distribution with a shape parameter of 2 or less).&lt;/li>
&lt;/ul>
&lt;p>There are other versions of the CLT in which some of these assumptions are relaxed, but we’ll focus on the ‘classic’ version.&lt;/p>
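&lt;p>To see why the finite-variance assumption matters, here’s a minimal base R sketch. It isn’t part of the simulation in this post, and the helper &lt;code>sample_mean_iqr()&lt;/code> and its arguments are names made up for this illustration. It compares how spread out the sample means are for a standard normal versus a standard Cauchy as &lt;em>n&lt;/em> grows:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Illustrative sketch: spread of sample means, normal vs Cauchy
set.seed(42)

# Interquartile range of `reps` sample means, each from a sample of size n
# (hypothetical helper, defined only for this example)
sample_mean_iqr &amp;lt;- function(rdist, n, reps = 2000) {
  IQR(replicate(reps, mean(rdist(n))))
}

sizes &amp;lt;- c(10, 100, 1000)

# Normal: the spread shrinks roughly like 1 / sqrt(n)
normal_iqr &amp;lt;- sapply(sizes, function(n) sample_mean_iqr(rnorm, n))

# Cauchy: the mean of n standard Cauchy draws is itself standard Cauchy,
# so the spread does not shrink as n grows
cauchy_iqr &amp;lt;- sapply(sizes, function(n) sample_mean_iqr(rcauchy, n))

data.frame(n = sizes, normal_iqr, cauchy_iqr)
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>The normal column should get steadily smaller, while the Cauchy column should hover around the same value no matter how large the sample is. Averaging more data doesn’t help when the variance isn’t finite.&lt;/p>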
&lt;p>Putting it in math terms:&lt;/p>
&lt;p>$$
\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \overset{d}\longrightarrow \mathcal{N}(0, 1)
$$&lt;/p>
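&lt;p>As a worked example of the denominator, take the uniform distribution on [-20, 20] and the sample size of &lt;em>n&lt;/em> = 60 used in the simulation below. A uniform distribution on [a, b] has mean (a + b)/2 and variance (b - a)²/12, so:&lt;/p>
&lt;p>$$
\mu = 0, \qquad \sigma = \frac{40}{\sqrt{12}} \approx 11.55, \qquad \frac{\sigma}{\sqrt{n}} = \frac{11.55}{\sqrt{60}} \approx 1.49
$$&lt;/p>
&lt;p>In other words, the sample means should be roughly normal with a standard deviation of about 1.49, and dividing by that number is what rescales them to a standard normal.&lt;/p>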
&lt;h1 id="simulating">Simulating&lt;/h1>
&lt;p>We’re not the kind of people to just accept something because someone threw some fancy Greek symbols at us. We want to simulate it to give ourselves some confidence it works in practice.&lt;/p>
&lt;p>Let’s create a tibble of ten thousand random values from each of six different distributions, which we’ll call our ‘population’:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># 10,000 draws from six different distributions&lt;/span>
population_data &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">tibble&lt;/span>(
uniform &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">runif&lt;/span>(&lt;span style="color:#ae81ff">10000&lt;/span>, min &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">-20&lt;/span>, max &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">20&lt;/span>),
normal &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rnorm&lt;/span>(&lt;span style="color:#ae81ff">10000&lt;/span>, mean &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>, sd &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">4&lt;/span>),
binomial &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rbinom&lt;/span>(&lt;span style="color:#ae81ff">10000&lt;/span>, size &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>, prob &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.5&lt;/span>),
beta &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rbeta&lt;/span>(&lt;span style="color:#ae81ff">10000&lt;/span>, shape1 &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.9&lt;/span>, shape2 &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.5&lt;/span>),
exponential &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rexp&lt;/span>(&lt;span style="color:#ae81ff">10000&lt;/span>, &lt;span style="color:#ae81ff">.4&lt;/span>),
chisquare &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rchisq&lt;/span>(&lt;span style="color:#ae81ff">10000&lt;/span>, df &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>),
)
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="zkuvdcewpv" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#zkuvdcewpv table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#zkuvdcewpv thead, #zkuvdcewpv tbody, #zkuvdcewpv tfoot, #zkuvdcewpv tr, #zkuvdcewpv td, #zkuvdcewpv th {
border-style: none;
}
&amp;#10;#zkuvdcewpv p {
margin: 0;
padding: 0;
}
&amp;#10;#zkuvdcewpv .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#zkuvdcewpv .gt_title {
color: #333333;
font-size: 20px;
font-weight: bolder;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#zkuvdcewpv .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#zkuvdcewpv .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#zkuvdcewpv .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#zkuvdcewpv .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#zkuvdcewpv .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#zkuvdcewpv .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#zkuvdcewpv .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#zkuvdcewpv .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#zkuvdcewpv .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#zkuvdcewpv .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#zkuvdcewpv .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#zkuvdcewpv .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#zkuvdcewpv .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zkuvdcewpv .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#zkuvdcewpv .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#zkuvdcewpv .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#zkuvdcewpv .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zkuvdcewpv .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#zkuvdcewpv .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zkuvdcewpv .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#zkuvdcewpv .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zkuvdcewpv .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#zkuvdcewpv .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zkuvdcewpv .gt_left {
text-align: left;
}
&amp;#10;#zkuvdcewpv .gt_center {
text-align: center;
}
&amp;#10;#zkuvdcewpv .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#zkuvdcewpv .gt_font_normal {
font-weight: normal;
}
&amp;#10;#zkuvdcewpv .gt_font_bold {
font-weight: bold;
}
&amp;#10;#zkuvdcewpv .gt_font_italic {
font-style: italic;
}
&amp;#10;#zkuvdcewpv .gt_super {
font-size: 65%;
}
&amp;#10;#zkuvdcewpv .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#zkuvdcewpv .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#zkuvdcewpv .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#zkuvdcewpv .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#zkuvdcewpv .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#zkuvdcewpv .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#zkuvdcewpv .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#zkuvdcewpv .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#zkuvdcewpv div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_heading">
&lt;td colspan="6" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>Six 'Population' Distributions - Ten-Thousand Values - First Five Rows&lt;/td>
&lt;/tr>
&amp;#10; &lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="uniform">uniform&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="normal">normal&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="binomial">binomial&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="beta">beta&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="exponential">exponential&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="chisquare">chisquare&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="uniform" class="gt_row gt_right">-15.243338&lt;/td>
&lt;td headers="normal" class="gt_row gt_right">-0.7022216&lt;/td>
&lt;td headers="binomial" class="gt_row gt_right">0&lt;/td>
&lt;td headers="beta" class="gt_row gt_right">0.9473270&lt;/td>
&lt;td headers="exponential" class="gt_row gt_right">2.34764246&lt;/td>
&lt;td headers="chisquare" class="gt_row gt_right">2.8571390&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="uniform" class="gt_row gt_right">-9.788476&lt;/td>
&lt;td headers="normal" class="gt_row gt_right">-3.2383413&lt;/td>
&lt;td headers="binomial" class="gt_row gt_right">1&lt;/td>
&lt;td headers="beta" class="gt_row gt_right">0.0730441&lt;/td>
&lt;td headers="exponential" class="gt_row gt_right">6.00295709&lt;/td>
&lt;td headers="chisquare" class="gt_row gt_right">0.9272733&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="uniform" class="gt_row gt_right">-11.432831&lt;/td>
&lt;td headers="normal" class="gt_row gt_right">-2.0977844&lt;/td>
&lt;td headers="binomial" class="gt_row gt_right">0&lt;/td>
&lt;td headers="beta" class="gt_row gt_right">0.5160286&lt;/td>
&lt;td headers="exponential" class="gt_row gt_right">0.91033272&lt;/td>
&lt;td headers="chisquare" class="gt_row gt_right">3.1607328&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="uniform" class="gt_row gt_right">18.142428&lt;/td>
&lt;td headers="normal" class="gt_row gt_right">2.5360707&lt;/td>
&lt;td headers="binomial" class="gt_row gt_right">1&lt;/td>
&lt;td headers="beta" class="gt_row gt_right">0.9428468&lt;/td>
&lt;td headers="exponential" class="gt_row gt_right">3.08170008&lt;/td>
&lt;td headers="chisquare" class="gt_row gt_right">1.9489974&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="uniform" class="gt_row gt_right">-7.078033&lt;/td>
&lt;td headers="normal" class="gt_row gt_right">-1.7224620&lt;/td>
&lt;td headers="binomial" class="gt_row gt_right">0&lt;/td>
&lt;td headers="beta" class="gt_row gt_right">0.6134417&lt;/td>
&lt;td headers="exponential" class="gt_row gt_right">0.08687766&lt;/td>
&lt;td headers="chisquare" class="gt_row gt_right">2.5317882&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>This ‘wide’ tibble is good for sampling from, but we’ll also transform it into a long version which will have other uses (note the ’_l’ suffix).&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Long version of random data&lt;/span>
population_data_l &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
population_data &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">pivot_longer&lt;/span>(cols &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">everything&lt;/span>(), names_to &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;distribution&amp;#39;&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="fuujtlarme" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#fuujtlarme table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#fuujtlarme thead, #fuujtlarme tbody, #fuujtlarme tfoot, #fuujtlarme tr, #fuujtlarme td, #fuujtlarme th {
border-style: none;
}
&amp;#10;#fuujtlarme p {
margin: 0;
padding: 0;
}
&amp;#10;#fuujtlarme .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#fuujtlarme .gt_title {
color: #333333;
font-size: 20px;
font-weight: bolder;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#fuujtlarme .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#fuujtlarme .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#fuujtlarme .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#fuujtlarme .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#fuujtlarme .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#fuujtlarme .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#fuujtlarme .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#fuujtlarme .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#fuujtlarme .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#fuujtlarme .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#fuujtlarme .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#fuujtlarme .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#fuujtlarme .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#fuujtlarme .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#fuujtlarme .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#fuujtlarme .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#fuujtlarme .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#fuujtlarme .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#fuujtlarme .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#fuujtlarme .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#fuujtlarme .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#fuujtlarme .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#fuujtlarme .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#fuujtlarme .gt_left {
text-align: left;
}
&amp;#10;#fuujtlarme .gt_center {
text-align: center;
}
&amp;#10;#fuujtlarme .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#fuujtlarme .gt_font_normal {
font-weight: normal;
}
&amp;#10;#fuujtlarme .gt_font_bold {
font-weight: bold;
}
&amp;#10;#fuujtlarme .gt_font_italic {
font-style: italic;
}
&amp;#10;#fuujtlarme .gt_super {
font-size: 65%;
}
&amp;#10;#fuujtlarme .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#fuujtlarme .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#fuujtlarme .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#fuujtlarme .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#fuujtlarme .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#fuujtlarme .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#fuujtlarme .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#fuujtlarme .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#fuujtlarme div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_heading">
&lt;td colspan="2" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>Six Distributions - Post 'pivot_longer() - First Value of Each&lt;/td>
&lt;/tr>
&amp;#10; &lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="distribution">distribution&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="value">value&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">beta&lt;/td>
&lt;td headers="value" class="gt_row gt_right">0.9473270&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">binomial&lt;/td>
&lt;td headers="value" class="gt_row gt_right">0.0000000&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">chisquare&lt;/td>
&lt;td headers="value" class="gt_row gt_right">2.8571390&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">exponential&lt;/td>
&lt;td headers="value" class="gt_row gt_right">2.3476425&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">normal&lt;/td>
&lt;td headers="value" class="gt_row gt_right">-0.7022216&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">uniform&lt;/td>
&lt;td headers="value" class="gt_row gt_right">-15.2433376&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>Here’s a histogram of each of the population distributions:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2025-07-14-clt/index_files/figure-html/unnamed-chunk-6-1.png" width="672" />&lt;/p>
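&lt;p>Because we chose the distributions ourselves, the theoretical mean and standard deviation of each one are known in closed form. The sketch below (base R; the names &lt;code>a&lt;/code>, &lt;code>b&lt;/code> and &lt;code>theoretical_stats&lt;/code> are made up for this example) tabulates them using the standard textbook formulas, which gives something to sanity-check any simulated statistics against:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Closed-form mean and SD for the six distributions, using the same
# parameters as the simulated population (illustrative sketch)
a &amp;lt;- 0.9  # beta shape1
b &amp;lt;- 0.5  # beta shape2

theoretical_stats &amp;lt;- data.frame(
  distribution = c(&amp;#39;uniform&amp;#39;, &amp;#39;normal&amp;#39;, &amp;#39;binomial&amp;#39;,
                   &amp;#39;beta&amp;#39;, &amp;#39;exponential&amp;#39;, &amp;#39;chisquare&amp;#39;),
  mean = c(
    (-20 + 20) / 2,  # uniform: (min + max) / 2
    0,               # normal: mean
    1 * 0.5,         # binomial: size * prob
    a / (a + b),     # beta: shape1 / (shape1 + shape2)
    1 / 0.4,         # exponential: 1 / rate
    2                # chi-square: df
  ),
  sd = c(
    (20 - (-20)) / sqrt(12),                  # uniform: (max - min) / sqrt(12)
    4,                                        # normal: sd
    sqrt(1 * 0.5 * (1 - 0.5)),                # binomial: sqrt(size * p * (1 - p))
    sqrt(a * b / ((a + b)^2 * (a + b + 1))),  # beta
    1 / 0.4,                                  # exponential: 1 / rate
    sqrt(2 * 2)                               # chi-square: sqrt(2 * df)
  )
)

theoretical_stats
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>With ten thousand draws per distribution, the mean and SD calculated from the simulated population should land close to these values, though not exactly on them.&lt;/p>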
&lt;p>Let’s define a function &lt;code>take_random_sample_mean()&lt;/code> which takes a sample from all of the population distributions and calculates the mean. If we use this function repeatedly, we should end up with a data set that demonstrates the central limit theorem.&lt;/p>
&lt;p>Let’s draw 20,000 samples of size 60, calculate the mean (and standard deviation) of each, bind it all together into a single data frame, and shape it into a long version.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Define a function to take a random sample of rows from our data and return each column&amp;#39;s mean and SD&lt;/span>
take_random_sample_mean &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(data, sample_size) {
&lt;span style="color:#a6e22e">slice_sample&lt;/span>(.data &lt;span style="color:#f92672">=&lt;/span> data, n &lt;span style="color:#f92672">=&lt;/span> sample_size) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(&lt;span style="color:#a6e22e">across&lt;/span>(&lt;span style="color:#a6e22e">everything&lt;/span>(), &lt;span style="color:#a6e22e">list&lt;/span>(sample_mean &lt;span style="color:#f92672">=&lt;/span> mean, sample_sd &lt;span style="color:#f92672">=&lt;/span> sd)))
}
&lt;span style="color:#75715e"># Draw 20,000 means of size 60 from our random data&lt;/span>
sample_size &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#ae81ff">60&lt;/span>
sample_means &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">map&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">20000&lt;/span>, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">take_random_sample_mean&lt;/span>(population_data, sample_size &lt;span style="color:#f92672">=&lt;/span> sample_size)) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Bind the sample means into a single tibble&lt;/span>
&lt;span style="color:#a6e22e">list_rbind&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Move to a long version of the data&lt;/span>
&lt;span style="color:#a6e22e">pivot_longer&lt;/span>(
&lt;span style="color:#a6e22e">everything&lt;/span>(),
names_to &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;distribution&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;.value&amp;#39;&lt;/span>),
names_pattern &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;(\\w+)_(\\w+_\\w+)&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here are the resulting histograms of the sample means, with only the x-axis free to change scale. The beta and binomial look pretty normal, and the others might be as well, but the differences in variance make it difficult to tell.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2025-07-14-clt/index_files/figure-html/unnamed-chunk-8-1.png" width="672" />&lt;/p>
&lt;p>If you recall the formula at the start, to get to a standard normal we also need to subtract the population mean and divide by the population standard deviation over the square-root of n. Using the long version of the population data we can calculate the population mean and SD statistics:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">population_data_stats &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
population_data_l &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(distribution) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(
population_mean &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(value),
population_sd &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">sd&lt;/span>(value),
n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">n&lt;/span>()
)
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="lpeuntixij" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_heading">
&lt;td colspan="4" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>Random Means - Statistics&lt;/td>
&lt;/tr>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="distribution">distribution&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="population_mean">population_mean&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="population_sd">population_sd&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="n">n&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">beta&lt;/td>
&lt;td headers="population_mean" class="gt_row gt_right">0.64352237&lt;/td>
&lt;td headers="population_sd" class="gt_row gt_right">0.3105243&lt;/td>
&lt;td headers="n" class="gt_row gt_right">10000&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">binomial&lt;/td>
&lt;td headers="population_mean" class="gt_row gt_right">0.50260000&lt;/td>
&lt;td headers="population_sd" class="gt_row gt_right">0.5000182&lt;/td>
&lt;td headers="n" class="gt_row gt_right">10000&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">chisquare&lt;/td>
&lt;td headers="population_mean" class="gt_row gt_right">2.03125146&lt;/td>
&lt;td headers="population_sd" class="gt_row gt_right">2.0189741&lt;/td>
&lt;td headers="n" class="gt_row gt_right">10000&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">exponential&lt;/td>
&lt;td headers="population_mean" class="gt_row gt_right">2.48144359&lt;/td>
&lt;td headers="population_sd" class="gt_row gt_right">2.5049145&lt;/td>
&lt;td headers="n" class="gt_row gt_right">10000&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">normal&lt;/td>
&lt;td headers="population_mean" class="gt_row gt_right">-0.03205698&lt;/td>
&lt;td headers="population_sd" class="gt_row gt_right">3.9813502&lt;/td>
&lt;td headers="n" class="gt_row gt_right">10000&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">uniform&lt;/td>
&lt;td headers="population_mean" class="gt_row gt_right">-0.09148261&lt;/td>
&lt;td headers="population_sd" class="gt_row gt_right">11.6405771&lt;/td>
&lt;td headers="n" class="gt_row gt_right">10000&lt;/td>&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>Joining each of those statistics into the sample means by their distribution allows us to nicely standardise.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># CLT Calculation&lt;/span>
clt &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
sample_means &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Join in the population mean, sd, and n&lt;/span>
&lt;span style="color:#a6e22e">left_join&lt;/span>(population_data_stats, by &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;distribution&amp;#39;&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Scale to the standard normal&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
clt &lt;span style="color:#f92672">=&lt;/span> (sample_mean &lt;span style="color:#f92672">-&lt;/span> population_mean) &lt;span style="color:#f92672">/&lt;/span> (population_sd &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(sample_size) )
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2025-07-14-clt/index_files/figure-html/unnamed-chunk-12-1.png" width="672" />&lt;/p>
&lt;p>That’s better: they now at least all look the same, except for the binomial, which ends up having higher counts because its values are discrete rather than continuous. While you can &lt;em>kind of&lt;/em> guess that they’re normal from a histogram, we can get a better sense from a quantile-quantile plot. The standard normal quantiles are on the x-axis, and our sample mean quantiles are on the y-axis.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2025-07-14-clt/index_files/figure-html/unnamed-chunk-13-1.png" width="672" />
They all track a standard normal pretty closely; however, the chi-square and exponential do tend to diverge slightly at the ends. We’ll dive a bit deeper into that later in the post.&lt;/p>
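&lt;p>For anyone who wants to build a plot like this without the post’s data, here’s a minimal base-R sketch. It is not the code behind the figure above: it assumes an exponential population with a rate of 0.4 (picked only because its mean and SD of 2.5 are close to the exponential row in the table), it draws straight from that distribution rather than from a finite population, and names such as &lt;code>standardised&lt;/code> are illustrative.&lt;/p>

```r
# Sketch: QQ plot of standardised sample means (assumed exponential population)
set.seed(2025)

sample_size = 60     # same sample size as in the post
n_means     = 5000   # fewer than the post's 20,000, to keep it quick
rate        = 0.4    # assumption: exponential rate, so mean = sd = 1 / rate

# Draw n_means samples and keep the mean of each one
sample_means = replicate(n_means, mean(rexp(sample_size, rate = rate)))

# Standardise using the known population mean and sd
population_mean = 1 / rate
population_sd   = 1 / rate
standardised = (sample_means - population_mean) / (population_sd / sqrt(sample_size))

# Standard normal quantiles on the x-axis, sample mean quantiles on the y-axis
qqnorm(standardised, main = "Sample means vs. standard normal")

# Reference line y = x: points on it match the standard normal exactly
abline(0, 1, col = "red")
```

&lt;p>Points that hug the line through the middle but peel away at either end indicate tails that differ from a normal’s, which is the pattern described above for the skewed distributions.&lt;/p>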
&lt;p>Let’s summarise: we took twenty-thousand sample means of size sixty from six wildly different distributions, and we were able to see that the distributions of these sample means approximately followed a normal distribution. This is the essence of the central limit theorem, which allows us to use statistical methods for normal distributions on problems that involve population distributions that aren’t normal.&lt;/p>
&lt;h1 id="in-practice-with-mistakes">In Practice, With Mistakes&lt;/h1>
&lt;p>That’s all well and good if you’ve got the population mean and standard deviation, but in most cases you’re not going to have that. You’re also likely not going to have the resources to take twenty-thousand different samples.&lt;/p>
&lt;p>Let’s take a pretty classic example: interviewing six people about something (weight, height, voting intentions, etc.) and taking the average, which is your sample mean. What the CLT can give you is an interval around your sample mean, a confidence interval (we’ll use the classic 95%), that is intended to capture the &lt;em>true mean&lt;/em> of the population.&lt;/p>
&lt;p>So you use a bit of algebra and move some terms around in the CLT formula and you get this:&lt;/p>
&lt;p>$$
\bar{X} \pm z_{0.025} \cdot \frac{s}{\sqrt{n}}
$$&lt;/p>
&lt;p>where \(\bar{X}\) is your sample mean, \(z_{0.025}\) is the critical value (&lt;code>qnorm(0.975)&lt;/code> in R), \(s\) is the sample standard deviation and \(n\) is the sample size. The sample standard deviation over the square root of \(n\) is more commonly known as the standard error.&lt;/p>
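&lt;p>For anyone who wants the “bit of algebra” spelled out, here’s one way to get there. This is a sketch: it assumes the standardised sample mean is approximately standard normal, and then swaps the unknown \(\sigma\) for the sample value \(s\).&lt;/p>
&lt;p>$$
P\left(-z_{0.025} \le \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \le z_{0.025}\right) \approx 0.95
$$&lt;/p>
&lt;p>Multiplying through by \(\sigma / \sqrt{n}\) and rearranging so that \(\mu\) sits in the middle gives:&lt;/p>
&lt;p>$$
P\left(\bar{X} - z_{0.025} \cdot \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{0.025} \cdot \frac{\sigma}{\sqrt{n}}\right) \approx 0.95
$$&lt;/p>
&lt;p>Replacing \(\sigma\) with \(s\) then gives the interval shown above.&lt;/p>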
&lt;p>More simply, we take the sample mean and add/subtract the 97.5th percentile of a standard normal distribution times the standard error to get our 95% interval. Some of you may have spotted a small mistake I’ve made here; I’ll leave it in, and it will come to light soon.&lt;/p>
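&lt;p>As a quick worked example of the formula, here’s the interval for a single made-up sample of six heights. The values in &lt;code>heights_cm&lt;/code> are hypothetical, and the normal critical value is used exactly as described above, small mistake and all.&lt;/p>

```r
# Hypothetical sample: heights (cm) of six people
heights_cm = c(162, 171, 168, 180, 175, 166)

n              = length(heights_cm)
sample_mean    = mean(heights_cm)
standard_error = sd(heights_cm) / sqrt(n)   # s / sqrt(n)

# 97.5th percentile of the standard normal, roughly 1.96
z_crit = qnorm(0.975)

# Sample mean plus/minus the critical value times the standard error
ci = c(
  lower = sample_mean - z_crit * standard_error,
  upper = sample_mean + z_crit * standard_error
)

print(round(ci, 1))
```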
&lt;p>We’re in frequentist territory here, so while it’s tempting to say “there’s a 95% probability that the true mean is in the interval”, we shouldn’t. Why? Because to a frequentist, the true mean \(\mu\) is a fixed value: it’s either in the interval or it’s not, so we can’t assign a probability to it. What we should say is that if we were to repeat the sampling process many times, in the long run the intervals we calculate should contain the true mean 95% of the time. We can simulate this to see if that is true.&lt;/p>
&lt;p>We use a sample size of 6, taking the mean and standard deviation of 6 observations from each of our population distributions, and repeat this 10,000 times.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Ten-thousand random sample means of size six&lt;/span>
small_sample_size &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#ae81ff">6&lt;/span>
repeated_samples &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">map&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">10000&lt;/span>, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">take_random_sample_mean&lt;/span>(population_data, sample_size &lt;span style="color:#f92672">=&lt;/span> small_sample_size)) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Bind the samples into a single tibble&lt;/span>
&lt;span style="color:#a6e22e">list_rbind&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Wide to long tibble&lt;/span>
&lt;span style="color:#a6e22e">pivot_longer&lt;/span>(
&lt;span style="color:#a6e22e">everything&lt;/span>(),
names_to &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;distribution&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;.value&amp;#39;&lt;/span>),
names_pattern &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;(\\w+)_(\\w+_\\w+)&amp;#39;&lt;/span>
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Join with our population mean, sd, and n&lt;/span>
&lt;span style="color:#a6e22e">left_join&lt;/span>(population_data_stats, by &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;distribution&amp;#39;&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>For each of the samples from the distributions we calculate the 95% confidence interval. Because we know the true population mean, we can determine whether it is or is not within the interval. Then we calculate the percentage of samples for which the population mean fell within the 95% interval.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">conf_intervals &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
repeated_samples &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Calculate CIs and whether CI contains the true population mean&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
upper_ci &lt;span style="color:#f92672">=&lt;/span> sample_mean &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">qnorm&lt;/span>(&lt;span style="color:#ae81ff">0.975&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> sample_sd &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(small_sample_size),
lower_ci &lt;span style="color:#f92672">=&lt;/span> sample_mean &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#a6e22e">qnorm&lt;/span>(&lt;span style="color:#ae81ff">0.975&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> sample_sd &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(small_sample_size),
within_ci &lt;span style="color:#f92672">=&lt;/span> population_mean &lt;span style="color:#f92672">&amp;lt;&lt;/span> upper_ci &lt;span style="color:#f92672">&amp;amp;&lt;/span> population_mean &lt;span style="color:#f92672">&amp;gt;&lt;/span> lower_ci
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Determine percentage of CI that contain the true mean&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(distribution) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(percent_within_ci &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(within_ci))
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="mzhuiwavnp" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="distribution">Distribution&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="percent_within_ci">% Within CI&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">normal&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">89.36%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">uniform&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">88.95%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">beta&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">87.87%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">chisquare&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">83.32%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">exponential&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">82.87%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">binomial&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">77.90%&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>Uh oh! We’re way off our 95% here, what happened?&lt;/p>
&lt;p>This is the mistake I was talking about earlier: we used a normal distribution to calculate the CIs, even though we estimated the population standard deviation \(\sigma\) using our sample standard deviation \(s\). With our small sample size of 6, the standardised sample mean follows a t-distribution with \(n - 1 = 5\) degrees of freedom, not a normal.&lt;/p>
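&lt;p>To get a feel for how much this matters, here is a minimal sketch in base R (not part of the original analysis) comparing the two critical values for a sample size of 6. The names &lt;code>z_crit&lt;/code>, &lt;code>t_crit&lt;/code> and &lt;code>normal_coverage&lt;/code> are illustrative only, and the coverage calculation assumes a normal population, which is the case where the t-distribution result holds exactly.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Illustrative only: compare normal and t critical values for n = 6
small_sample_size &amp;lt;- 6

# Two-sided 95% critical values
z_crit &amp;lt;- qnorm(0.975)
t_crit &amp;lt;- qt(0.975, df = small_sample_size - 1)

# How much wider the t-based interval is than the normal-based one
t_crit / z_crit

# Coverage we actually get if we use the normal critical value
# when the statistic really follows a t-distribution with 5 df
normal_coverage &amp;lt;- 2 * pt(z_crit, df = small_sample_size - 1) - 1
normal_coverage
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>The t critical value comes out at about 2.57 versus 1.96 for the normal, so the t-based interval is roughly 31% wider. The normal-based interval only covers about 89% of the time, which lines up with the normal row in the table above.&lt;/p>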
&lt;p>We’ll re-run our confidence intervals, but this time we use &lt;code>qt()&lt;/code>, the t-distribution quantile function, instead of &lt;code>qnorm()&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">conf_intervals &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
repeated_samples &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
upper_ci &lt;span style="color:#f92672">=&lt;/span> sample_mean &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">qt&lt;/span>(p &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0.975&lt;/span>, df &lt;span style="color:#f92672">=&lt;/span> small_sample_size &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> (sample_sd &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(small_sample_size)),
lower_ci &lt;span style="color:#f92672">=&lt;/span> sample_mean &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#a6e22e">qt&lt;/span>(p &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0.975&lt;/span>, df &lt;span style="color:#f92672">=&lt;/span> small_sample_size &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> (sample_sd &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(small_sample_size)),
within_ci &lt;span style="color:#f92672">=&lt;/span> population_mean &lt;span style="color:#f92672">&amp;lt;&lt;/span> upper_ci &lt;span style="color:#f92672">&amp;amp;&lt;/span> population_mean &lt;span style="color:#f92672">&amp;gt;&lt;/span> lower_ci
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(distribution) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(percent_within_ci &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(within_ci))
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="gbaweccfdi" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#gbaweccfdi table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#gbaweccfdi thead, #gbaweccfdi tbody, #gbaweccfdi tfoot, #gbaweccfdi tr, #gbaweccfdi td, #gbaweccfdi th {
border-style: none;
}
&amp;#10;#gbaweccfdi p {
margin: 0;
padding: 0;
}
&amp;#10;#gbaweccfdi .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#gbaweccfdi .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#gbaweccfdi .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#gbaweccfdi .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#gbaweccfdi .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#gbaweccfdi .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#gbaweccfdi .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#gbaweccfdi .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#gbaweccfdi .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#gbaweccfdi .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#gbaweccfdi .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#gbaweccfdi .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#gbaweccfdi .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#gbaweccfdi .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#gbaweccfdi .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#gbaweccfdi .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#gbaweccfdi .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#gbaweccfdi .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#gbaweccfdi .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#gbaweccfdi .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#gbaweccfdi .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#gbaweccfdi .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#gbaweccfdi .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#gbaweccfdi .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#gbaweccfdi .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#gbaweccfdi .gt_left {
text-align: left;
}
&amp;#10;#gbaweccfdi .gt_center {
text-align: center;
}
&amp;#10;#gbaweccfdi .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#gbaweccfdi .gt_font_normal {
font-weight: normal;
}
&amp;#10;#gbaweccfdi .gt_font_bold {
font-weight: bold;
}
&amp;#10;#gbaweccfdi .gt_font_italic {
font-style: italic;
}
&amp;#10;#gbaweccfdi .gt_super {
font-size: 65%;
}
&amp;#10;#gbaweccfdi .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#gbaweccfdi .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#gbaweccfdi .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#gbaweccfdi .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#gbaweccfdi .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#gbaweccfdi .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#gbaweccfdi .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#gbaweccfdi .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#gbaweccfdi div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="distribution">Distribution&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="percent_within_ci">% Within CI&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">binomial&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">96.83%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">normal&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">95.05%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">uniform&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">93.88%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">beta&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">92.59%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">chisquare&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">88.91%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">exponential&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">88.47%&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>That looks a bit better! The binomial is higher than 95% due to its discrete nature (it can’t smoothly hit all possible values), and the normal and uniform distributions are close to our 95% value, but our heavily skewed beta, exponential and chi-square still aren’t up to scratch. With those heavily skewed population distributions, we need a larger sample size before the central limit theorem ‘kicks in’.&lt;/p>
&lt;p>Let’s re-run using the dataset from the start of the post, which used a sample size of 60.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">conf_intervals &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#75715e"># Using our original sample size of 60&lt;/span>
sample_means &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">left_join&lt;/span>(population_data_stats, by &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;distribution&amp;#39;&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
upper_ci &lt;span style="color:#f92672">=&lt;/span> sample_mean &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">qnorm&lt;/span>(&lt;span style="color:#ae81ff">0.975&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> (sample_sd &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(sample_size)),
lower_ci &lt;span style="color:#f92672">=&lt;/span> sample_mean &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#a6e22e">qnorm&lt;/span>(&lt;span style="color:#ae81ff">0.975&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> (sample_sd &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(sample_size)),
within_ci &lt;span style="color:#f92672">=&lt;/span> population_mean &lt;span style="color:#f92672">&amp;lt;&lt;/span> upper_ci &lt;span style="color:#f92672">&amp;amp;&lt;/span> population_mean &lt;span style="color:#f92672">&amp;gt;&lt;/span> lower_ci
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(distribution) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(percent_within_ci &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(within_ci))
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="aihartbnfg" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#aihartbnfg table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#aihartbnfg thead, #aihartbnfg tbody, #aihartbnfg tfoot, #aihartbnfg tr, #aihartbnfg td, #aihartbnfg th {
border-style: none;
}
&amp;#10;#aihartbnfg p {
margin: 0;
padding: 0;
}
&amp;#10;#aihartbnfg .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#aihartbnfg .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#aihartbnfg .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#aihartbnfg .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#aihartbnfg .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#aihartbnfg .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#aihartbnfg .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#aihartbnfg .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#aihartbnfg .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#aihartbnfg .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#aihartbnfg .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#aihartbnfg .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#aihartbnfg .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#aihartbnfg .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#aihartbnfg .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#aihartbnfg .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#aihartbnfg .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#aihartbnfg .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#aihartbnfg .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#aihartbnfg .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#aihartbnfg .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#aihartbnfg .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#aihartbnfg .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#aihartbnfg .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#aihartbnfg .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#aihartbnfg .gt_left {
text-align: left;
}
&amp;#10;#aihartbnfg .gt_center {
text-align: center;
}
&amp;#10;#aihartbnfg .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#aihartbnfg .gt_font_normal {
font-weight: normal;
}
&amp;#10;#aihartbnfg .gt_font_bold {
font-weight: bold;
}
&amp;#10;#aihartbnfg .gt_font_italic {
font-style: italic;
}
&amp;#10;#aihartbnfg .gt_super {
font-size: 65%;
}
&amp;#10;#aihartbnfg .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#aihartbnfg .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#aihartbnfg .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#aihartbnfg .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#aihartbnfg .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#aihartbnfg .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#aihartbnfg .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#aihartbnfg .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#aihartbnfg div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="distribution">Distribution&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="percent_within_ci">% Within CI&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">binomial&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">94.86%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">normal&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">94.52%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">beta&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">94.40%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">uniform&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">94.36%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">chisquare&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">93.60%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="distribution" class="gt_row gt_left">exponential&lt;/td>
&lt;td headers="percent_within_ci" class="gt_row gt_right">92.97%&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>Better, but even with a sample size of 60, the skewed distributions are still slightly under the mark.&lt;/p>
&lt;p>This raises the question: how do the sample means of these skewed distributions behave as the sample size increases? What we can do is repeatedly sample, but increase the sample size each time. We’ll go up in powers of two, from a sample size of 1 to 1024.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">increasing_sample_size &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#75715e"># Sample sizes increasing in powers of 2 (1, 2, 4, 8, ...)&lt;/span>
&lt;span style="color:#a6e22e">map&lt;/span>(
&lt;span style="color:#ae81ff">2&lt;/span>&lt;span style="color:#a6e22e">^&lt;/span>(&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">10&lt;/span>),
&lt;span style="color:#75715e"># Anonymous function&lt;/span>
&lt;span style="color:#a6e22e">\&lt;/span>(y) {
&lt;span style="color:#75715e"># 1000 sample means&lt;/span>
&lt;span style="color:#a6e22e">map&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">1000&lt;/span>, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">take_random_sample_mean&lt;/span>(population_data, sample_size &lt;span style="color:#f92672">=&lt;/span> y)) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Bind them all together&lt;/span>
&lt;span style="color:#a6e22e">list_rbind&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Wide to long per distribution&lt;/span>
&lt;span style="color:#a6e22e">pivot_longer&lt;/span>(
&lt;span style="color:#a6e22e">everything&lt;/span>(),
names_to &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;distribution&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;.value&amp;#39;&lt;/span>),
names_pattern &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;(\\w+)_(\\w+_\\w+)&amp;#39;&lt;/span>
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Add in our population means and SDs&lt;/span>
&lt;span style="color:#a6e22e">left_join&lt;/span>(population_data_stats, by &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;distribution&amp;#39;&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Add sample size as a column&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(sample_size &lt;span style="color:#f92672">=&lt;/span> y) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Normalise to -&amp;gt; N(0,1)&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(clt &lt;span style="color:#f92672">=&lt;/span> (sample_mean &lt;span style="color:#f92672">-&lt;/span> population_mean) &lt;span style="color:#f92672">/&lt;/span> (population_sd &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(sample_size) ))
}
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">list_rbind&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With this data we can create an animation of a Q-Q plot for the uniform versus exponential distributions, showing how the distribution of sample means changes as the sample size increases. You’ll see the distribution approaching the standard normal distribution.&lt;/p>
&lt;p>&lt;img src="index_files/figure-html/unnamed-chunk-22-1.gif" alt="">&lt;!-- -->
You’ll see the distribution of sample means from the uniform distribution approaches a normal much faster than the exponential. It’s very subjective, but I think the uniform starts looking reasonably good at a sample size of 8. The exponential, however, takes much longer to converge to a normal.&lt;/p>
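&lt;p>To put a rough number on that visual impression, here’s a minimal base R sketch. It is not the code behind the animation: the populations (a standard uniform and a rate-1 exponential), the seed, the number of simulations and the helper names &lt;code>standardised_means&lt;/code> and &lt;code>qq_gap&lt;/code> are all assumptions made for illustration. It standardises simulated sample means at a sample size of 8 and measures the largest gap between their quantiles and the standard normal quantiles, which is the distance a Q-Q plot shows.&lt;/p>

```r
set.seed(42)       # arbitrary seed so the sketch is reproducible

n_sims = 10000     # number of simulated sample means (assumed)
sample_size = 8    # the sample size discussed above

# Standardise sample means: (sample mean - population mean) / (population sd / sqrt(n))
standardised_means = function(draw, pop_mean, pop_sd) {
  sample_means = replicate(n_sims, mean(draw(sample_size)))
  (sample_means - pop_mean) / (pop_sd / sqrt(sample_size))
}

# Assumed populations: Uniform(0, 1) has mean 1/2 and sd sqrt(1/12),
# Exponential(rate = 1) has mean 1 and sd 1
clt_unif = standardised_means(function(n) runif(n), 0.5, sqrt(1 / 12))
clt_exp = standardised_means(function(n) rexp(n), 1, 1)

# Largest vertical distance between simulated and standard normal quantiles,
# evaluated on the 1st to 99th percentiles
probs = seq(0.01, 0.99, by = 0.01)
qq_gap = function(x) max(abs(quantile(x, probs) - qnorm(probs)))

gap_unif = qq_gap(clt_unif)
gap_exp = qq_gap(clt_exp)
```

&lt;p>Comparing &lt;code>gap_unif&lt;/code> with &lt;code>gap_exp&lt;/code> gives a crude, single-number version of what the animation shows at this sample size. Calling &lt;code>qqnorm(clt_exp)&lt;/code> followed by &lt;code>abline(0, 1)&lt;/code> draws one static frame of the same comparison.&lt;/p>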
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>The central limit theorem is something I’ve read about many times and had a reasonable grasp on, but I was always wondering about how it behaved with different population distributions. Running simulations and visualising how it behaved in different scenarios has given me (and hopefully yourselves) a much clearer view on how it works, and probably more importantly where it doesn’t work well.&lt;/p></description></item><item><title>BOM On Target: Assessing the Bureau's Forecast Accuracy</title><link>https://clt.blog.foletta.net/post/2024-08-15-bom/</link><pubDate>Sun, 29 Jun 2025 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2024-08-15-bom/</guid><description>&lt;link href="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/htmltools-fill/fill.css" rel="stylesheet" />
&lt;script src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/htmlwidgets/htmlwidgets.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/plotly-binding/plotly.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/typedarray/typedarray.min.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/jquery/jquery.min.js">&lt;/script>
&lt;link href="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/crosstalk/css/crosstalk.min.css" rel="stylesheet" />
&lt;script src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/crosstalk/js/crosstalk.min.js">&lt;/script>
&lt;link href="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/plotly-htmlwidgets-css/plotly-htmlwidgets.css" rel="stylesheet" />
&lt;script src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/plotly-main/plotly-latest.min.js">&lt;/script>
&lt;p>In this article we’re going to be taking a look at the forecast accuracy of Australia’s Bureau of Meteorology (BOM), but before starting, a quick preface.&lt;/p>
&lt;p>I had trepidation writing this article, and it almost didn’t get off the ground. I’m not a meteorologist, and I’m also a data science dilettante. Wading into this particular field without foundational knowledge felt very arrogant, and to be frank I was worried I’d make a fool out of myself. But a chat with one of my great friends, who has a PhD in meteorology, assuaged some of my apprehensions, so here we are. Although there’s still plenty of room for foolishness!&lt;/p>
&lt;p>This post came about after searching and being unable to find historical information on the BOM’s forecasting performance. It may be that I just missed it somewhere, so if someone knows where this may be, please reach out.&lt;/p>
&lt;h1 id="tldr">TL;DR&lt;/h1>
&lt;p>This is a hefty article, so here’s a quick rundown of what we’ll cover before diving in:&lt;/p>
&lt;ul>
&lt;li>Building temperature and 1-6 day temperature forecast datasets from 526 locations around Australia.&lt;/li>
&lt;li>Determining and visualising the accuracy of these forecasts.&lt;/li>
&lt;li>Creating a ‘jaggedness’ metric per site.&lt;/li>
&lt;li>Using Bayesian methods to build a model for predicting forecast accuracy.&lt;/li>
&lt;li>Assessing the model’s performance on in-sample and out-of-sample data.&lt;/li>
&lt;/ul>
&lt;p>Let’s go…&lt;/p>
&lt;h1 id="weather-stations">Weather Stations&lt;/h1>
&lt;p>In previous articles I’ve gone into detail about the data acquisition process, I think because it’s quite an enjoyable process. This time however I’m going to keep it brief and give you a quick overview of how I got the data, and what the data is.&lt;/p>
&lt;p>I needed three pieces of data:&lt;/p>
&lt;ol>
&lt;li>A list of BOM weather stations&lt;/li>
&lt;li>The temperature at those weather stations over a period of time&lt;/li>
&lt;li>The temperature forecasts at each of those stations&lt;/li>
&lt;/ol>
&lt;p>Number one was easy, as the BOM provides a &lt;a href="http://www.bom.gov.au/climate/data/lists_by_element/stations.txt">list of weather stations&lt;/a>, including their name, latitude and longitude. This list gets filtered down from ~6,500 total active weather stations to 526 that have a World Meteorological Organization (WMO) ID. Here are their locations around Australia:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-3-1.png" width="672" />&lt;/p>
&lt;h1 id="temperature-data">Temperature Data&lt;/h1>
&lt;p>To acquire the temperature and forecast data, I wrote a script which reaches out to the BOM and retrieves the temperature and the forecasts for the closest city/town to each of the weather stations. This ran every ten minutes, which appears to be the update interval for most temperature readings on the BOM website.&lt;/p>
&lt;p>Here’s a view of the temperature data, broken down state by state:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-5-1.png" width="672" />
That’s a total of 4,251,931 temperature readings over ~8 weeks. If you squint you should be able to see a general downward trend in the temperature as we move through Autumn towards Winter.&lt;/p>
&lt;p>I could easily go off on a number of tangents with this data, but I’ll just pull out the locations with the highest and lowest mean temperatures during this period:
&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-6-1.png" width="672" />
Unsurprisingly we’ve got a city in the north of the country, and a mountain in the very south. Take note of the different shapes of these graphs; we’ll discuss this later on in the article.&lt;/p>
&lt;h1 id="forecast-data">Forecast Data&lt;/h1>
&lt;p>The BOM provides 3-hourly temperature forecasts for each location (e.g. 1:00AM, 4:00AM, … 10:00PM) out to 7 days. As with the temperature, I’ve acquired these from their website every ten minutes for each location. This means we get 8 (forecasts per day) * 7 (days) * 526 (locations) = 29,456 forecasts every ten minutes. Most of these are identical, as the forecast hasn’t changed. I’ve pruned this data back into two datasets:&lt;/p>
&lt;ol>
&lt;li>A &lt;strong>changed forecast&lt;/strong> set, which contains rows of data where the forecast temperature for a date/time/location has changed from the previous forecast.&lt;/li>
&lt;li>A &lt;strong>day-lagged forecast&lt;/strong> set, which contains the forecasts which are exactly 1,2,3,…6 days out from the date/time we acquired the data.&lt;/li>
&lt;/ol>
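&lt;p>To make the first of these concrete, here’s a rough base R sketch of how a &lt;em>changed forecast&lt;/em> set could be derived. It isn’t the script I ran: the data frame, its column names (&lt;code>location&lt;/code>, &lt;code>forecast_for&lt;/code>, &lt;code>acquired_at&lt;/code>, &lt;code>forecast_temp&lt;/code>) and the values are hypothetical, and it assumes the first forecast seen for a date/time/location counts as a change.&lt;/p>

```r
# Hypothetical scraped forecasts: one row per location, target time and acquisition time
forecasts = data.frame(
  location = c("A", "A", "A", "A", "B", "B", "B"),
  forecast_for = as.POSIXct(rep("2025-04-10 13:00:00", 7), tz = "UTC"),
  acquired_at = as.POSIXct(c(
    "2025-04-08 10:00:00", "2025-04-08 10:10:00",
    "2025-04-08 10:20:00", "2025-04-08 10:30:00",
    "2025-04-08 10:00:00", "2025-04-08 10:10:00",
    "2025-04-08 10:20:00"
  ), tz = "UTC"),
  forecast_temp = c(21, 21, 22, 22, 18, 18, 18)
)

# Order the rows so "previous" means the previous acquisition for the same target
forecasts = forecasts[order(forecasts$location, forecasts$forecast_for, forecasts$acquired_at), ]

# One group per location and forecast target date/time
group = paste(forecasts$location, format(forecasts$forecast_for))

# Within each group, keep the first row (assumption) and any row whose
# temperature differs from the forecast acquired immediately before it
changed = as.logical(ave(
  forecasts$forecast_temp, group,
  FUN = function(x) c(TRUE, diff(x) != 0)
))

changed_forecasts = forecasts[changed, ]
```

&lt;p>In this toy data the seven scraped rows collapse to three: the initial forecast for each location, plus the one revision for location A.&lt;/p>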
&lt;p>Visualising the forecast data over time becomes messy very quickly, so instead let’s take a look at how often the BOM updates their models. This uses the &lt;em>changed forecast&lt;/em> dataset, and is a histogram of the hour in the day in which the changed forecast was published to the website:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-7-1.png" width="672" />
That spike in the late afternoon? Another data point to take note of, as it will be pertinent later on in this post.&lt;/p>
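&lt;p>For reference, the counts behind a histogram like this can be built in a few lines of base R. This is only a sketch: the timestamps are made up, the name &lt;code>published_at&lt;/code> is hypothetical, and the time zone is set to UTC purely to keep the example self-contained.&lt;/p>

```r
# Hypothetical publication times for a handful of changed forecasts
published_at = as.POSIXct(c(
  "2025-04-08 05:10:00", "2025-04-08 16:20:00", "2025-04-08 16:40:00",
  "2025-04-09 16:30:00", "2025-04-09 04:50:00"
), tz = "UTC")

# Extract the hour of the day (0 to 23) from each timestamp
hour_of_day = as.integer(format(published_at, "%H"))

# Count changes per hour, keeping empty hours so every hour gets a bar
changes_per_hour = table(factor(hour_of_day, levels = 0:23))

# The hour with the most changed forecasts
busiest_hour = as.integer(names(which.max(changes_per_hour)))
```

&lt;p>Passing &lt;code>changes_per_hour&lt;/code> to &lt;code>barplot()&lt;/code> gives a quick version of the plot.&lt;/p>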
&lt;h1 id="forecast-accuracy">Forecast Accuracy&lt;/h1>
&lt;p>We join each of the aforementioned datasets with the recorded temperature data. For each date/time we’ve got multiple forecasts ranging from 3 hours to 7 days out, and then the actual temperature that was recorded at that date/time. This allows us to calculate the forecast accuracy as &lt;em>(forecast temp - recorded temp)&lt;/em>.&lt;/p>
&lt;p>Let’s get to the crux of this article: forecast accuracy. There’s a small fly in the ointment, in that the published forecast temperatures are integers, whereas the recorded temperatures have a single decimal place. So a decision: do we round the recorded temperatures to whole numbers as well, or keep them as reals? I’ve made the decision here to continue to use the decimal temperatures, rather than round/ceil/floor to turn them into integers as well. All we can do is work with the data we’ve got, and the fewer assumptions or changes the better.&lt;/p>
&lt;p>To start, here’s an interactive view of both the 1 day and 6 day forecast overlaid across the recorded temperature for one site (Essendon Airport in Melbourne). Initially it’s a bit messy, but if you zoom in and click the legend to add/remove each of the forecasts you get an idea about how it’s changed, and how far off each one was.&lt;/p>
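&lt;p>As a small illustration of the join and the subtraction, here’s a base R sketch. The two data frames, their column names and every temperature in them are invented for the example, not taken from the real datasets. The forecasts are whole numbers and the recorded temperatures keep their decimal place, matching the decision above.&lt;/p>

```r
# Hypothetical 1-day-out forecasts (integers, as published)
forecast = data.frame(
  location = "Essendon",
  date_time = as.POSIXct(c("2025-04-08 10:00:00", "2025-04-08 13:00:00"), tz = "UTC"),
  forecast_temp = c(15, 19)
)

# Hypothetical recorded temperatures (one decimal place, left unrounded)
recorded = data.frame(
  location = "Essendon",
  date_time = as.POSIXct(c("2025-04-08 10:00:00", "2025-04-08 13:00:00"), tz = "UTC"),
  recorded_temp = c(14.2, 19.6)
)

# Join each forecast to the temperature recorded at the same place and time
joined = merge(forecast, recorded, by = c("location", "date_time"))

# Positive values mean the forecast was too warm, negative too cold
joined$forecast_error = joined$forecast_temp - joined$recorded_temp

# One simple summary: the mean absolute error across the joined rows
mean_abs_error = mean(abs(joined$forecast_error))
```

&lt;p>Keeping the sign of the error is a deliberate choice in this sketch, as it separates forecasts that ran warm from those that ran cold. Taking the absolute value only at the summary step keeps that information available.&lt;/p>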
&lt;div class="plotly html-widget html-fill-item" id="htmlwidget-1" style="width:672px;height:480px;">&lt;/div>
&lt;script type="application/json" data-for="htmlwidget-1">{"x":{"visdat":{"66274445c0ca":["function () ","plotlyVisDat"],"66273e4cd7b":["function () ","data"],"66277d9d5618":["function () ","data"]},"cur_data":"66277d9d5618","attrs":{"66274445c0ca":{"x":{},"y":{},"mode":"lines+markers","name":"Recorded Temperature","alpha_stroke":1,"sizes":[10,100],"spans":[1,20],"type":"scatter"},"66273e4cd7b":{"x":{},"y":{},"mode":"lines+markers","name":"6 Day Forecast","alpha_stroke":1,"sizes":[10,100],"spans":[1,20],"type":"scatter","alpha":0.59999999999999998,"inherit":true},"66277d9d5618":{"x":{},"y":{},"mode":"lines+markers","name":"1 Day Forecast","alpha_stroke":1,"sizes":[10,100],"spans":[1,20],"type":"scatter","alpha":0.59999999999999998,"inherit":true}},"layout":{"margin":{"b":40,"l":60,"t":25,"r":10},"title":"Essendon, VIC: Forecast vs. Recorded Temperature","xaxis":{"domain":[0,1],"automargin":true,"title":"Date/Time"},"yaxis":{"domain":[0,1],"automargin":true,"title":"Temperature (°C)"},"hovermode":"closest","showlegend":true},"source":"A","config":{"modeBarButtonsToAdd":["hoverclosest","hovercompare"],"showSendToCloud":false},"data":[{"x":["2025-04-08 10:00:00.000000","2025-04-08 13:00:00.000000","2025-04-08 16:00:00.000000","2025-04-08 19:00:00.000000","2025-04-08 22:00:00.000000","2025-04-09 04:00:00.000000","2025-04-09 07:00:00.000000","2025-04-09 10:00:00.000000","2025-04-09 13:00:00.000000","2025-04-09 16:00:00.000000","2025-04-09 19:00:00.000000","2025-04-09 22:00:00.000000","2025-04-10 04:00:00.000000","2025-04-10 07:00:00.000000","2025-04-10 10:00:00.000000","2025-04-10 13:00:00.000000","2025-04-10 16:00:00.000000","2025-04-10 19:00:00.000000","2025-04-10 22:00:00.000000","2025-04-11 04:00:00.000000","2025-04-11 07:00:00.000000","2025-04-11 10:00:00.000000","2025-04-11 13:00:00.000000","2025-04-11 16:00:00.000000","2025-04-11 19:00:00.000000","2025-04-11 22:00:00.000000","2025-04-12 04:00:00.000000","2025-04-12 07:00:00.000000","2025-04-12 
10:00:00.000000","2025-04-12 13:00:00.000000","2025-04-12 16:00:00.000000","2025-04-12 19:00:00.000000","2025-04-12 22:00:00.000000","2025-04-13 01:00:00.000000","2025-04-13 07:00:00.000000","2025-04-13 10:00:00.000000","2025-04-13 13:00:00.000000","2025-04-13 16:00:00.000000","2025-04-13 19:00:00.000000","2025-04-13 22:00:00.000000","2025-04-14 01:00:00.000000","2025-04-14 04:00:00.000000","2025-04-14 07:00:00.000000","2025-04-14 10:00:00.000000","2025-04-14 13:00:00.000000","2025-04-14 16:00:00.000000","2025-04-14 19:00:00.000000","2025-04-14 22:00:00.000000","2025-04-15 01:00:00.000000","2025-04-15 04:00:00.000000","2025-04-15 07:00:00.000000","2025-04-15 10:00:00.000000","2025-04-15 13:00:00.000000","2025-04-15 16:00:00.000000","2025-04-15 19:00:00.000000","2025-04-15 22:00:00.000000","2025-04-16 01:00:00.000000","2025-04-16 04:00:00.000000","2025-04-16 07:00:00.000000","2025-04-16 10:00:00.000000","2025-04-16 13:00:00.000000","2025-04-16 16:00:00.000000","2025-04-16 19:00:00.000000","2025-04-16 22:00:00.000000","2025-04-17 01:00:00.000000","2025-04-17 04:00:00.000000","2025-04-17 07:00:00.000000","2025-04-17 10:00:00.000000","2025-04-17 13:00:00.000000","2025-04-17 16:00:00.000000","2025-04-17 19:00:00.000000","2025-04-17 22:00:00.000000","2025-04-18 01:00:00.000000","2025-04-18 04:00:00.000000","2025-04-18 07:00:00.000000","2025-04-18 10:00:00.000000","2025-04-18 13:00:00.000000","2025-04-18 16:00:00.000000","2025-04-18 19:00:00.000000","2025-04-18 22:00:00.000000","2025-04-19 01:00:00.000000","2025-04-19 04:00:00.000000","2025-04-19 07:00:00.000000","2025-04-19 10:00:00.000000","2025-04-19 13:00:00.000000","2025-04-19 16:00:00.000000","2025-04-19 19:00:00.000000","2025-04-19 22:00:00.000000","2025-04-20 01:00:00.000000","2025-04-20 04:00:00.000000","2025-04-20 07:00:00.000000","2025-04-20 10:00:00.000000","2025-04-20 13:00:00.000000","2025-04-20 16:00:00.000000","2025-04-20 19:00:00.000000","2025-04-20 22:00:00.000000","2025-04-21 
01:00:00.000000","2025-04-21 04:00:00.000000","2025-04-21 07:00:00.000000","2025-04-21 10:00:00.000000","2025-04-21 13:00:00.000000","2025-04-21 16:00:00.000000","2025-04-21 19:00:00.000000","2025-04-21 22:00:00.000000","2025-04-22 01:00:00.000000","2025-04-22 04:00:00.000000","2025-04-22 07:00:00.000000","2025-04-22 10:00:00.000000","2025-04-22 13:00:00.000000","2025-04-22 16:00:00.000000","2025-04-22 19:00:00.000000","2025-04-22 22:00:00.000000","2025-04-23 01:00:00.000000","2025-04-23 04:00:00.000000","2025-04-23 07:00:00.000000","2025-04-23 10:00:00.000000","2025-04-23 16:00:00.000000","2025-04-23 19:00:00.000000","2025-04-23 22:00:00.000000","2025-04-24 01:00:00.000000","2025-04-24 04:00:00.000000","2025-04-24 07:00:00.000000","2025-04-24 10:00:00.000000","2025-04-24 13:00:00.000000","2025-04-24 16:00:00.000000","2025-04-24 19:00:00.000000","2025-04-24 22:00:00.000000","2025-04-25 01:00:00.000000","2025-04-25 04:00:00.000000","2025-04-25 07:00:00.000000","2025-04-25 10:00:00.000000","2025-04-25 13:00:00.000000","2025-04-25 16:00:00.000000","2025-04-25 19:00:00.000000","2025-04-25 22:00:00.000000","2025-04-26 01:00:00.000000","2025-04-26 04:00:00.000000","2025-04-26 07:00:00.000000","2025-04-26 10:00:00.000000","2025-04-26 13:00:00.000000","2025-04-26 16:00:00.000000","2025-04-26 19:00:00.000000","2025-04-26 22:00:00.000000","2025-04-27 01:00:00.000000","2025-04-27 04:00:00.000000","2025-04-27 07:00:00.000000","2025-04-27 10:00:00.000000","2025-04-27 13:00:00.000000","2025-04-27 16:00:00.000000","2025-04-27 19:00:00.000000","2025-04-27 22:00:00.000000","2025-04-28 01:00:00.000000","2025-04-28 04:00:00.000000","2025-04-28 07:00:00.000000","2025-04-28 10:00:00.000000","2025-04-28 13:00:00.000000","2025-04-28 16:00:00.000000","2025-04-28 19:00:00.000000","2025-04-28 22:00:00.000000","2025-04-29 01:00:00.000000","2025-04-29 04:00:00.000000","2025-04-29 07:00:00.000000","2025-04-29 10:00:00.000000","2025-04-29 13:00:00.000000","2025-04-29 
16:00:00.000000","2025-04-29 19:00:00.000000","2025-04-29 22:00:00.000000","2025-04-30 01:00:00.000000","2025-04-30 07:00:00.000000","2025-04-30 10:00:00.000000","2025-04-30 13:00:00.000000","2025-04-30 16:00:00.000000","2025-04-30 19:00:00.000000","2025-04-30 22:00:00.000000","2025-05-01 01:00:00.000000","2025-05-01 04:00:00.000000","2025-05-01 07:00:00.000000","2025-05-01 10:00:00.000000","2025-05-01 13:00:00.000000","2025-05-01 16:00:00.000000","2025-05-01 19:00:00.000000","2025-05-01 22:00:00.000000","2025-05-02 01:00:00.000000","2025-05-02 04:00:00.000000","2025-05-02 07:00:00.000000","2025-05-02 10:00:00.000000","2025-05-02 13:00:00.000000","2025-05-02 16:00:00.000000","2025-05-02 19:00:00.000000","2025-05-02 22:00:00.000000","2025-05-03 01:00:00.000000","2025-05-03 04:00:00.000000","2025-05-03 07:00:00.000000","2025-05-03 10:00:00.000000","2025-05-03 13:00:00.000000","2025-05-03 16:00:00.000000","2025-05-03 19:00:00.000000","2025-05-03 22:00:00.000000","2025-05-04 01:00:00.000000","2025-05-04 04:00:00.000000","2025-05-04 07:00:00.000000","2025-05-04 10:00:00.000000","2025-05-04 13:00:00.000000","2025-05-04 16:00:00.000000","2025-05-04 19:00:00.000000","2025-05-04 22:00:00.000000","2025-05-05 01:00:00.000000","2025-05-05 04:00:00.000000","2025-05-05 07:00:00.000000","2025-05-05 10:00:00.000000","2025-05-05 13:00:00.000000","2025-05-05 16:00:00.000000","2025-05-05 19:00:00.000000","2025-05-05 22:00:00.000000","2025-05-06 01:00:00.000000","2025-05-06 04:00:00.000000","2025-05-06 07:00:00.000000","2025-05-06 10:00:00.000000","2025-05-06 13:00:00.000000","2025-05-06 16:00:00.000000","2025-05-06 19:00:00.000000","2025-05-06 22:00:00.000000","2025-05-07 01:00:00.000000","2025-05-07 04:00:00.000000","2025-05-07 07:00:00.000000","2025-05-07 10:00:00.000000","2025-05-07 13:00:00.000000","2025-05-07 16:00:00.000000","2025-05-07 19:00:00.000000","2025-05-07 22:00:00.000000","2025-05-08 01:00:00.000000","2025-05-08 04:00:00.000000","2025-05-08 
07:00:00.000000","2025-05-08 10:00:00.000000","2025-05-08 13:00:00.000000","2025-05-08 16:00:00.000000","2025-05-08 19:00:00.000000","2025-05-08 22:00:00.000000","2025-05-09 01:00:00.000000","2025-05-09 04:00:00.000000","2025-05-09 07:00:00.000000","2025-05-09 10:00:00.000000","2025-05-09 13:00:00.000000","2025-05-09 16:00:00.000000","2025-05-09 19:00:00.000000","2025-05-09 22:00:00.000000","2025-05-10 01:00:00.000000","2025-05-10 04:00:00.000000","2025-05-10 07:00:00.000000","2025-05-10 10:00:00.000000","2025-05-10 13:00:00.000000","2025-05-10 16:00:00.000000","2025-05-10 19:00:00.000000","2025-05-10 22:00:00.000000","2025-05-11 01:00:00.000000","2025-05-11 04:00:00.000000","2025-05-11 07:00:00.000000","2025-05-11 10:00:00.000000","2025-05-11 13:00:00.000000","2025-05-11 16:00:00.000000","2025-05-11 19:00:00.000000","2025-05-11 22:00:00.000000","2025-05-12 01:00:00.000000","2025-05-12 04:00:00.000000","2025-05-12 07:00:00.000000","2025-05-12 10:00:00.000000","2025-05-12 13:00:00.000000","2025-05-12 16:00:00.000000","2025-05-12 19:00:00.000000","2025-05-13 01:00:00.000000","2025-05-13 04:00:00.000000","2025-05-13 07:00:00.000000","2025-05-13 10:00:00.000000","2025-05-13 13:00:00.000000","2025-05-13 16:00:00.000000","2025-05-13 19:00:00.000000","2025-05-13 22:00:00.000000","2025-05-14 01:00:00.000000","2025-05-14 04:00:00.000000","2025-05-14 07:00:00.000000","2025-05-14 10:00:00.000000","2025-05-14 13:00:00.000000","2025-05-14 16:00:00.000000","2025-05-14 19:00:00.000000","2025-05-14 22:00:00.000000","2025-05-15 01:00:00.000000","2025-05-15 04:00:00.000000","2025-05-15 07:00:00.000000","2025-05-15 10:00:00.000000","2025-05-15 13:00:00.000000","2025-05-15 16:00:00.000000","2025-05-15 19:00:00.000000","2025-05-15 22:00:00.000000","2025-05-16 01:00:00.000000","2025-05-16 04:00:00.000000","2025-05-16 07:00:00.000000","2025-05-16 10:00:00.000000","2025-05-16 13:00:00.000000","2025-05-16 16:00:00.000000","2025-05-16 19:00:00.000000","2025-05-16 
22:00:00.000000","2025-05-17 01:00:00.000000","2025-05-17 04:00:00.000000","2025-05-17 07:00:00.000000","2025-05-17 10:00:00.000000","2025-05-17 13:00:00.000000","2025-05-17 16:00:00.000000","2025-05-17 19:00:00.000000","2025-05-17 22:00:00.000000","2025-05-18 01:00:00.000000","2025-05-18 04:00:00.000000","2025-05-18 07:00:00.000000","2025-05-18 10:00:00.000000","2025-05-18 13:00:00.000000","2025-05-18 16:00:00.000000","2025-05-18 19:00:00.000000","2025-05-18 22:00:00.000000","2025-05-19 01:00:00.000000","2025-05-19 04:00:00.000000","2025-05-19 07:00:00.000000","2025-05-19 10:00:00.000000","2025-05-19 13:00:00.000000","2025-05-19 16:00:00.000000","2025-05-19 19:00:00.000000","2025-05-19 22:00:00.000000","2025-05-20 01:00:00.000000","2025-05-20 07:00:00.000000","2025-05-20 10:00:00.000000","2025-05-20 13:00:00.000000","2025-05-20 16:00:00.000000","2025-05-20 19:00:00.000000","2025-05-20 22:00:00.000000","2025-05-21 01:00:00.000000","2025-05-21 04:00:00.000000","2025-05-21 07:00:00.000000","2025-05-21 10:00:00.000000","2025-05-21 13:00:00.000000","2025-05-21 16:00:00.000000","2025-05-21 19:00:00.000000","2025-05-21 22:00:00.000000","2025-05-22 01:00:00.000000","2025-05-22 04:00:00.000000","2025-05-22 07:00:00.000000","2025-05-22 10:00:00.000000","2025-05-22 13:00:00.000000","2025-05-22 16:00:00.000000","2025-05-22 19:00:00.000000","2025-05-22 22:00:00.000000","2025-05-23 01:00:00.000000","2025-05-23 04:00:00.000000","2025-05-23 07:00:00.000000","2025-05-23 10:00:00.000000","2025-05-23 13:00:00.000000","2025-05-23 16:00:00.000000","2025-05-23 19:00:00.000000","2025-05-23 22:00:00.000000","2025-05-24 01:00:00.000000","2025-05-24 04:00:00.000000","2025-05-24 07:00:00.000000","2025-05-24 10:00:00.000000","2025-05-24 13:00:00.000000","2025-05-24 16:00:00.000000","2025-05-24 19:00:00.000000","2025-05-24 22:00:00.000000","2025-05-25 01:00:00.000000","2025-05-25 04:00:00.000000","2025-05-25 07:00:00.000000","2025-05-25 10:00:00.000000","2025-05-25 
13:00:00.000000","2025-05-25 16:00:00.000000","2025-05-25 19:00:00.000000","2025-05-25 22:00:00.000000","2025-05-26 01:00:00.000000","2025-05-26 04:00:00.000000","2025-05-26 07:00:00.000000","2025-05-26 10:00:00.000000","2025-05-26 13:00:00.000000","2025-05-26 16:00:00.000000","2025-05-26 19:00:00.000000","2025-05-26 22:00:00.000000","2025-05-27 01:00:00.000000","2025-05-27 04:00:00.000000","2025-05-27 07:00:00.000000","2025-05-27 10:00:00.000000","2025-05-27 13:00:00.000000","2025-05-27 16:00:00.000000","2025-05-27 19:00:00.000000","2025-05-27 22:00:00.000000","2025-05-28 01:00:00.000000","2025-05-28 04:00:00.000000","2025-05-28 07:00:00.000000","2025-05-28 10:00:00.000000","2025-05-28 13:00:00.000000","2025-05-28 16:00:00.000000","2025-05-28 19:00:00.000000","2025-05-28 22:00:00.000000","2025-05-29 01:00:00.000000","2025-05-29 04:00:00.000000","2025-05-29 07:00:00.000000","2025-05-29 10:00:00.000000","2025-05-29 13:00:00.000000","2025-05-29 16:00:00.000000","2025-05-29 19:00:00.000000","2025-05-29 22:00:00.000000","2025-05-30 01:00:00.000000","2025-05-30 04:00:00.000000","2025-05-30 07:00:00.000000","2025-05-30 10:00:00.000000","2025-05-30 13:00:00.000000","2025-05-30 16:00:00.000000","2025-05-30 19:00:00.000000","2025-05-30 22:00:00.000000","2025-05-31 01:00:00.000000","2025-05-31 04:00:00.000000","2025-05-31 07:00:00.000000","2025-05-31 10:00:00.000000","2025-05-31 13:00:00.000000","2025-05-31 16:00:00.000000","2025-05-31 19:00:00.000000","2025-05-31 22:00:00.000000","2025-06-01 01:00:00.000000","2025-06-01 04:00:00.000000","2025-06-01 07:00:00.000000","2025-06-01 10:00:00.000000","2025-06-01 13:00:00.000000","2025-06-01 16:00:00.000000","2025-06-01 19:00:00.000000","2025-06-01 22:00:00.000000","2025-06-02 01:00:00.000000","2025-06-02 04:00:00.000000","2025-06-02 07:00:00.000000","2025-06-02 10:00:00.000000","2025-06-02 13:00:00.000000","2025-06-02 16:00:00.000000","2025-06-02 19:00:00.000000","2025-06-02 22:00:00.000000","2025-06-03 
01:00:00.000000","2025-06-03 04:00:00.000000","2025-06-03 07:00:00.000000","2025-06-03 10:00:00.000000","2025-06-03 13:00:00.000000","2025-06-03 16:00:00.000000","2025-06-03 19:00:00.000000","2025-06-03 22:00:00.000000","2025-06-04 01:00:00.000000","2025-06-04 04:00:00.000000","2025-06-04 07:00:00.000000","2025-06-04 10:00:00.000000","2025-06-04 13:00:00.000000","2025-06-04 16:00:00.000000","2025-06-04 19:00:00.000000","2025-06-04 22:00:00.000000","2025-06-05 01:00:00.000000","2025-06-05 04:00:00.000000","2025-06-05 07:00:00.000000","2025-06-05 10:00:00.000000","2025-06-05 13:00:00.000000","2025-06-05 16:00:00.000000","2025-06-05 19:00:00.000000","2025-06-06 01:00:00.000000","2025-06-06 04:00:00.000000","2025-06-06 07:00:00.000000"],"y":[15.5,16.899999999999999,18.100000000000001,14.199999999999999,12.199999999999999,9.5,8.8000000000000007,16.600000000000001,22.100000000000001,19.199999999999999,16.199999999999999,16,14.6,15.5,21.600000000000001,26.100000000000001,27.699999999999999,23.100000000000001,21.100000000000001,15.800000000000001,15.9,19.300000000000001,20.899999999999999,20.300000000000001,17.699999999999999,14.1,12.199999999999999,11.300000000000001,20.199999999999999,28,27.699999999999999,22,18.800000000000001,22.300000000000001,21.5,26.600000000000001,31.300000000000001,30.800000000000001,26.300000000000001,24.600000000000001,18,18.100000000000001,17.800000000000001,19.199999999999999,20.899999999999999,20.899999999999999,18.100000000000001,15.4,13.800000000000001,13,12.1,18.199999999999999,24.899999999999999,27.600000000000001,21.5,18.699999999999999,17.5,16.5,18.300000000000001,20.699999999999999,26.300000000000001,26.5,23.600000000000001,21.399999999999999,17.699999999999999,17.199999999999999,14.6,25.600000000000001,29.399999999999999,29,24.199999999999999,20.300000000000001,19.800000000000001,18.800000000000001,19.199999999999999,25.399999999999999,30.100000000000001,30.600000000000001,18.199999999999999,17.399999999999999,16.899999999999999,21.300
000000000001,21.300000000000001,24.300000000000001,27.800000000000001,27.699999999999999,25.800000000000001,22.899999999999999,21.399999999999999,21.699999999999999,20.300000000000001,19.699999999999999,24.5,22.100000000000001,18.699999999999999,16.300000000000001,15.5,14.9,14,18.699999999999999,21,19.100000000000001,16.899999999999999,14.4,14.1,13.4,13.800000000000001,14.699999999999999,17.699999999999999,18.300000000000001,17.600000000000001,15.300000000000001,14.4,12.4,11.9,17.800000000000001,23.5,17.800000000000001,16.100000000000001,14.9,13.699999999999999,17.399999999999999,23.199999999999999,25.199999999999999,26,23.199999999999999,23,21.699999999999999,17.699999999999999,16,23,22.699999999999999,18.300000000000001,17.100000000000001,16.600000000000001,15.699999999999999,15.4,15.300000000000001,16.800000000000001,19,17.100000000000001,16.800000000000001,16.699999999999999,16.199999999999999,15.6,15.300000000000001,16.5,15.199999999999999,16.399999999999999,13.199999999999999,13.4,11.699999999999999,11.1,12.300000000000001,16.300000000000001,18.100000000000001,17,14.9,14.199999999999999,13.699999999999999,13,13.4,15.9,16.5,14.300000000000001,14.5,13.5,12.800000000000001,9.9000000000000004,15.800000000000001,18.300000000000001,16.699999999999999,13.300000000000001,11.199999999999999,10.300000000000001,9,7.5,13.5,16.600000000000001,17.800000000000001,13.4,11.300000000000001,9.3000000000000007,7.5,6.4000000000000004,14.699999999999999,18.899999999999999,20.100000000000001,14.300000000000001,13.199999999999999,11,8.6999999999999993,8.8000000000000007,16.300000000000001,20.300000000000001,19.899999999999999,15.5,11.6,14.699999999999999,15.300000000000001,13.800000000000001,18.100000000000001,22.399999999999999,21.800000000000001,17.699999999999999,16.199999999999999,16.600000000000001,15.5,15.4,18.199999999999999,23,24.399999999999999,22.199999999999999,19.899999999999999,19.399999999999999,18.899999999999999,17.600000000000001,20.800000000000001,25.800000000000001
,25.100000000000001,22.199999999999999,19.800000000000001,17.5,15.4,15.6,19.300000000000001,16.899999999999999,17.800000000000001,14.1,11.300000000000001,8.9000000000000004,8.5999999999999996,8.1999999999999993,13.1,15.9,15.1,12.5,12.9,11.9,10.6,8.6999999999999993,14.1,16.399999999999999,17,12.800000000000001,10.800000000000001,8.8000000000000007,7.5,5.5,14.199999999999999,18.699999999999999,17.800000000000001,13.4,11.800000000000001,9.4000000000000004,7.9000000000000004,11.199999999999999,18.199999999999999,21.699999999999999,22.199999999999999,18.100000000000001,14.699999999999999,13.5,13.300000000000001,10.199999999999999,18.100000000000001,19.899999999999999,18.5,14.6,10.800000000000001,9.8000000000000007,9.4000000000000004,16.199999999999999,20.800000000000001,21.100000000000001,16.100000000000001,13.199999999999999,11.5,11.800000000000001,10.800000000000001,16.300000000000001,19.699999999999999,17.399999999999999,12.5,12.300000000000001,10.199999999999999,10.1,10.199999999999999,15.4,16.899999999999999,16.5,11.300000000000001,11.1,8.4000000000000004,7.4000000000000004,6.4000000000000004,14.5,16.100000000000001,16.800000000000001,15.5,14.1,12.6,10.9,10.6,10.800000000000001,11.699999999999999,10.800000000000001,9.1999999999999993,8.6999999999999993,9,6.7000000000000002,8,11.199999999999999,11.800000000000001,12.199999999999999,9,6.0999999999999996,5.2000000000000002,4,3,10.199999999999999,13.300000000000001,13.199999999999999,8.9000000000000004,7.7999999999999998,6,4.5,12.1,16.899999999999999,15.1,11.5,7.7000000000000002,5.5999999999999996,3.2999999999999998,1.5,9.9000000000000004,14.699999999999999,14.300000000000001,10.1,8.5,8,5.5,5.4000000000000004,9.5999999999999996,13.9,15.300000000000001,14.800000000000001,13.800000000000001,13.1,13,12.6,13.699999999999999,15.800000000000001,15.699999999999999,14.6,12.5,13.5,13.800000000000001,14,14.800000000000001,17.600000000000001,18.100000000000001,15.5,14.9,13.699999999999999,11.800000000000001,10,13.6,17.100000000000
001,17.5,14.6,16,14.9,14.6,13.6,15.4,18.5,19.5,15.5,12.699999999999999,10.4,9.5,8.6999999999999993,12.199999999999999,12.300000000000001,13.300000000000001,10.800000000000001,10.5,11,11.1,11.1,12.5,14.9,13.699999999999999,11.6,11.4,11.199999999999999,11.1,10.800000000000001,13.300000000000001,14.9,14.6,10.800000000000001,10.5,11.1,11.199999999999999,11.1,12.800000000000001,14.5,14.300000000000001,10.300000000000001,11.4,11.300000000000001,11.4,11.300000000000001,13.1,14.9,14.5,10.800000000000001,8.3000000000000007,7,6.7000000000000002,9.1999999999999993,13.9,16.5,16.800000000000001,13.9,9.9000000000000004,12,11,11.199999999999999,14.1,18,18.399999999999999,14.1,12.4,12,11.699999999999999,11.6,12,10.5,9.3000000000000007,8.8000000000000007,7.9000000000000004,7.0999999999999996,7.2000000000000002,9.4000000000000004,10.9,12.9,11.699999999999999,9.5999999999999996,7.5999999999999996,6.7000000000000002,6.9000000000000004,4.4000000000000004,11.699999999999999,14.4,13.4,11,10.800000000000001,10.6,10],"mode":"lines+markers","name":"Recorded Temperature","type":"scatter","marker":{"color":"rgba(31,119,180,1)","line":{"color":"rgba(31,119,180,1)"}},"error_y":{"color":"rgba(31,119,180,1)"},"error_x":{"color":"rgba(31,119,180,1)"},"line":{"color":"rgba(31,119,180,1)"},"xaxis":"x","yaxis":"y","frame":null},{"x":["2025-04-08 10:00:00.000000","2025-04-08 13:00:00.000000","2025-04-08 16:00:00.000000","2025-04-08 19:00:00.000000","2025-04-08 22:00:00.000000","2025-04-09 04:00:00.000000","2025-04-09 07:00:00.000000","2025-04-09 10:00:00.000000","2025-04-09 13:00:00.000000","2025-04-09 16:00:00.000000","2025-04-09 19:00:00.000000","2025-04-09 22:00:00.000000","2025-04-10 04:00:00.000000","2025-04-10 07:00:00.000000","2025-04-10 10:00:00.000000","2025-04-10 13:00:00.000000","2025-04-10 16:00:00.000000","2025-04-10 19:00:00.000000","2025-04-10 22:00:00.000000","2025-04-11 04:00:00.000000","2025-04-11 07:00:00.000000","2025-04-11 10:00:00.000000","2025-04-11 13:00:00.000000","2025-04-11 
16:00:00.000000","2025-04-11 19:00:00.000000","2025-04-11 22:00:00.000000","2025-04-12 04:00:00.000000","2025-04-12 07:00:00.000000","2025-04-12 10:00:00.000000","2025-04-12 13:00:00.000000","2025-04-12 16:00:00.000000","2025-04-12 19:00:00.000000","2025-04-12 22:00:00.000000","2025-04-13 01:00:00.000000","2025-04-13 07:00:00.000000","2025-04-13 10:00:00.000000","2025-04-13 13:00:00.000000","2025-04-13 16:00:00.000000","2025-04-13 19:00:00.000000","2025-04-13 22:00:00.000000","2025-04-14 01:00:00.000000","2025-04-14 04:00:00.000000","2025-04-14 07:00:00.000000","2025-04-14 10:00:00.000000","2025-04-14 13:00:00.000000","2025-04-14 16:00:00.000000","2025-04-14 19:00:00.000000","2025-04-14 22:00:00.000000","2025-04-15 01:00:00.000000","2025-04-15 04:00:00.000000","2025-04-15 07:00:00.000000","2025-04-15 10:00:00.000000","2025-04-15 13:00:00.000000","2025-04-15 16:00:00.000000","2025-04-15 19:00:00.000000","2025-04-15 22:00:00.000000","2025-04-16 01:00:00.000000","2025-04-16 04:00:00.000000","2025-04-16 07:00:00.000000","2025-04-16 10:00:00.000000","2025-04-16 13:00:00.000000","2025-04-16 16:00:00.000000","2025-04-16 19:00:00.000000","2025-04-16 22:00:00.000000","2025-04-17 01:00:00.000000","2025-04-17 04:00:00.000000","2025-04-17 07:00:00.000000","2025-04-17 10:00:00.000000","2025-04-17 13:00:00.000000","2025-04-17 16:00:00.000000","2025-04-17 19:00:00.000000","2025-04-17 22:00:00.000000","2025-04-18 01:00:00.000000","2025-04-18 04:00:00.000000","2025-04-18 07:00:00.000000","2025-04-18 10:00:00.000000","2025-04-18 13:00:00.000000","2025-04-18 16:00:00.000000","2025-04-18 19:00:00.000000","2025-04-18 22:00:00.000000","2025-04-19 01:00:00.000000","2025-04-19 04:00:00.000000","2025-04-19 07:00:00.000000","2025-04-19 10:00:00.000000","2025-04-19 13:00:00.000000","2025-04-19 16:00:00.000000","2025-04-19 19:00:00.000000","2025-04-19 22:00:00.000000","2025-04-20 01:00:00.000000","2025-04-20 04:00:00.000000","2025-04-20 07:00:00.000000","2025-04-20 
10:00:00.000000","2025-04-20 13:00:00.000000","2025-04-20 16:00:00.000000","2025-04-20 19:00:00.000000","2025-04-20 22:00:00.000000","2025-04-21 01:00:00.000000","2025-04-21 04:00:00.000000","2025-04-21 07:00:00.000000","2025-04-21 10:00:00.000000","2025-04-21 13:00:00.000000","2025-04-21 16:00:00.000000","2025-04-21 19:00:00.000000","2025-04-21 22:00:00.000000","2025-04-22 01:00:00.000000","2025-04-22 04:00:00.000000","2025-04-22 07:00:00.000000","2025-04-22 10:00:00.000000","2025-04-22 13:00:00.000000","2025-04-22 16:00:00.000000","2025-04-22 19:00:00.000000","2025-04-22 22:00:00.000000","2025-04-23 01:00:00.000000","2025-04-23 04:00:00.000000","2025-04-23 07:00:00.000000","2025-04-23 10:00:00.000000","2025-04-23 16:00:00.000000","2025-04-23 19:00:00.000000","2025-04-23 22:00:00.000000","2025-04-24 01:00:00.000000","2025-04-24 04:00:00.000000","2025-04-24 07:00:00.000000","2025-04-24 10:00:00.000000","2025-04-24 13:00:00.000000","2025-04-24 16:00:00.000000","2025-04-24 19:00:00.000000","2025-04-24 22:00:00.000000","2025-04-25 01:00:00.000000","2025-04-25 04:00:00.000000","2025-04-25 07:00:00.000000","2025-04-25 10:00:00.000000","2025-04-25 13:00:00.000000","2025-04-25 16:00:00.000000","2025-04-25 19:00:00.000000","2025-04-25 22:00:00.000000","2025-04-26 01:00:00.000000","2025-04-26 04:00:00.000000","2025-04-26 07:00:00.000000","2025-04-26 10:00:00.000000","2025-04-26 13:00:00.000000","2025-04-26 16:00:00.000000","2025-04-26 19:00:00.000000","2025-04-26 22:00:00.000000","2025-04-27 01:00:00.000000","2025-04-27 04:00:00.000000","2025-04-27 07:00:00.000000","2025-04-27 10:00:00.000000","2025-04-27 13:00:00.000000","2025-04-27 16:00:00.000000","2025-04-27 19:00:00.000000","2025-04-27 22:00:00.000000","2025-04-28 01:00:00.000000","2025-04-28 04:00:00.000000","2025-04-28 07:00:00.000000","2025-04-28 10:00:00.000000","2025-04-28 13:00:00.000000","2025-04-28 16:00:00.000000","2025-04-28 19:00:00.000000","2025-04-28 22:00:00.000000","2025-04-29 
01:00:00.000000","2025-04-29 04:00:00.000000","2025-04-29 07:00:00.000000","2025-04-29 10:00:00.000000","2025-04-29 13:00:00.000000","2025-04-29 16:00:00.000000","2025-04-29 19:00:00.000000","2025-04-29 22:00:00.000000","2025-04-30 01:00:00.000000","2025-04-30 07:00:00.000000","2025-04-30 10:00:00.000000","2025-04-30 13:00:00.000000","2025-04-30 16:00:00.000000","2025-04-30 19:00:00.000000","2025-04-30 22:00:00.000000","2025-05-01 01:00:00.000000","2025-05-01 04:00:00.000000","2025-05-01 07:00:00.000000","2025-05-01 10:00:00.000000","2025-05-01 13:00:00.000000","2025-05-01 16:00:00.000000","2025-05-01 19:00:00.000000","2025-05-01 22:00:00.000000","2025-05-02 01:00:00.000000","2025-05-02 04:00:00.000000","2025-05-02 07:00:00.000000","2025-05-02 10:00:00.000000","2025-05-02 13:00:00.000000","2025-05-02 16:00:00.000000","2025-05-02 19:00:00.000000","2025-05-02 22:00:00.000000","2025-05-03 01:00:00.000000","2025-05-03 04:00:00.000000","2025-05-03 07:00:00.000000","2025-05-03 10:00:00.000000","2025-05-03 13:00:00.000000","2025-05-03 16:00:00.000000","2025-05-03 19:00:00.000000","2025-05-03 22:00:00.000000","2025-05-04 01:00:00.000000","2025-05-04 04:00:00.000000","2025-05-04 07:00:00.000000","2025-05-04 10:00:00.000000","2025-05-04 13:00:00.000000","2025-05-04 16:00:00.000000","2025-05-04 19:00:00.000000","2025-05-04 22:00:00.000000","2025-05-05 01:00:00.000000","2025-05-05 04:00:00.000000","2025-05-05 07:00:00.000000","2025-05-05 10:00:00.000000","2025-05-05 13:00:00.000000","2025-05-05 16:00:00.000000","2025-05-05 19:00:00.000000","2025-05-05 22:00:00.000000","2025-05-06 01:00:00.000000","2025-05-06 04:00:00.000000","2025-05-06 07:00:00.000000","2025-05-06 10:00:00.000000","2025-05-06 13:00:00.000000","2025-05-06 16:00:00.000000","2025-05-06 19:00:00.000000","2025-05-06 22:00:00.000000","2025-05-07 01:00:00.000000","2025-05-07 04:00:00.000000","2025-05-07 07:00:00.000000","2025-05-07 10:00:00.000000","2025-05-07 13:00:00.000000","2025-05-07 
16:00:00.000000","2025-05-07 19:00:00.000000","2025-05-07 22:00:00.000000","2025-05-08 01:00:00.000000","2025-05-08 04:00:00.000000","2025-05-08 07:00:00.000000","2025-05-08 10:00:00.000000","2025-05-08 13:00:00.000000","2025-05-08 16:00:00.000000","2025-05-08 19:00:00.000000","2025-05-08 22:00:00.000000","2025-05-09 01:00:00.000000","2025-05-09 04:00:00.000000","2025-05-09 07:00:00.000000","2025-05-09 10:00:00.000000","2025-05-09 13:00:00.000000","2025-05-09 16:00:00.000000","2025-05-09 19:00:00.000000","2025-05-09 22:00:00.000000","2025-05-10 01:00:00.000000","2025-05-10 04:00:00.000000","2025-05-10 07:00:00.000000","2025-05-10 10:00:00.000000","2025-05-10 13:00:00.000000","2025-05-10 16:00:00.000000","2025-05-10 19:00:00.000000","2025-05-10 22:00:00.000000","2025-05-11 01:00:00.000000","2025-05-11 04:00:00.000000","2025-05-11 07:00:00.000000","2025-05-11 10:00:00.000000","2025-05-11 13:00:00.000000","2025-05-11 16:00:00.000000","2025-05-11 19:00:00.000000","2025-05-11 22:00:00.000000","2025-05-12 01:00:00.000000","2025-05-12 04:00:00.000000","2025-05-12 07:00:00.000000","2025-05-12 10:00:00.000000","2025-05-12 13:00:00.000000","2025-05-12 16:00:00.000000","2025-05-12 19:00:00.000000","2025-05-13 01:00:00.000000","2025-05-13 04:00:00.000000","2025-05-13 07:00:00.000000","2025-05-13 10:00:00.000000","2025-05-13 13:00:00.000000","2025-05-13 16:00:00.000000","2025-05-13 19:00:00.000000","2025-05-13 22:00:00.000000","2025-05-14 01:00:00.000000","2025-05-14 04:00:00.000000","2025-05-14 07:00:00.000000","2025-05-14 10:00:00.000000","2025-05-14 13:00:00.000000","2025-05-14 16:00:00.000000","2025-05-14 19:00:00.000000","2025-05-14 22:00:00.000000","2025-05-15 01:00:00.000000","2025-05-15 04:00:00.000000","2025-05-15 07:00:00.000000","2025-05-15 10:00:00.000000","2025-05-15 13:00:00.000000","2025-05-15 16:00:00.000000","2025-05-15 19:00:00.000000","2025-05-15 22:00:00.000000","2025-05-16 01:00:00.000000","2025-05-16 04:00:00.000000","2025-05-16 
07:00:00.000000","2025-05-16 10:00:00.000000","2025-05-16 13:00:00.000000","2025-05-16 16:00:00.000000","2025-05-16 19:00:00.000000","2025-05-16 22:00:00.000000","2025-05-17 01:00:00.000000","2025-05-17 04:00:00.000000","2025-05-17 07:00:00.000000","2025-05-17 10:00:00.000000","2025-05-17 13:00:00.000000","2025-05-17 16:00:00.000000","2025-05-17 19:00:00.000000","2025-05-17 22:00:00.000000","2025-05-18 01:00:00.000000","2025-05-18 04:00:00.000000","2025-05-18 07:00:00.000000","2025-05-18 10:00:00.000000","2025-05-18 13:00:00.000000","2025-05-18 16:00:00.000000","2025-05-18 19:00:00.000000","2025-05-18 22:00:00.000000","2025-05-19 01:00:00.000000","2025-05-19 04:00:00.000000","2025-05-19 07:00:00.000000","2025-05-19 10:00:00.000000","2025-05-19 13:00:00.000000","2025-05-19 16:00:00.000000","2025-05-19 19:00:00.000000","2025-05-19 22:00:00.000000","2025-05-20 01:00:00.000000","2025-05-20 07:00:00.000000","2025-05-20 10:00:00.000000","2025-05-20 13:00:00.000000","2025-05-20 16:00:00.000000","2025-05-20 19:00:00.000000","2025-05-20 22:00:00.000000","2025-05-21 01:00:00.000000","2025-05-21 04:00:00.000000","2025-05-21 07:00:00.000000","2025-05-21 10:00:00.000000","2025-05-21 13:00:00.000000","2025-05-21 16:00:00.000000","2025-05-21 19:00:00.000000","2025-05-21 22:00:00.000000","2025-05-22 01:00:00.000000","2025-05-22 04:00:00.000000","2025-05-22 07:00:00.000000","2025-05-22 10:00:00.000000","2025-05-22 13:00:00.000000","2025-05-22 16:00:00.000000","2025-05-22 19:00:00.000000","2025-05-22 22:00:00.000000","2025-05-23 01:00:00.000000","2025-05-23 04:00:00.000000","2025-05-23 07:00:00.000000","2025-05-23 10:00:00.000000","2025-05-23 13:00:00.000000","2025-05-23 16:00:00.000000","2025-05-23 19:00:00.000000","2025-05-23 22:00:00.000000","2025-05-24 01:00:00.000000","2025-05-24 04:00:00.000000","2025-05-24 07:00:00.000000","2025-05-24 10:00:00.000000","2025-05-24 13:00:00.000000","2025-05-24 16:00:00.000000","2025-05-24 19:00:00.000000","2025-05-24 
22:00:00.000000","2025-05-25 01:00:00.000000","2025-05-25 04:00:00.000000","2025-05-25 07:00:00.000000","2025-05-25 10:00:00.000000","2025-05-25 13:00:00.000000","2025-05-25 16:00:00.000000","2025-05-25 19:00:00.000000","2025-05-25 22:00:00.000000","2025-05-26 01:00:00.000000","2025-05-26 04:00:00.000000","2025-05-26 07:00:00.000000","2025-05-26 10:00:00.000000","2025-05-26 13:00:00.000000","2025-05-26 16:00:00.000000","2025-05-26 19:00:00.000000","2025-05-26 22:00:00.000000","2025-05-27 01:00:00.000000","2025-05-27 04:00:00.000000","2025-05-27 07:00:00.000000","2025-05-27 10:00:00.000000","2025-05-27 13:00:00.000000","2025-05-27 16:00:00.000000","2025-05-27 19:00:00.000000","2025-05-27 22:00:00.000000","2025-05-28 01:00:00.000000","2025-05-28 04:00:00.000000","2025-05-28 07:00:00.000000","2025-05-28 10:00:00.000000","2025-05-28 13:00:00.000000","2025-05-28 16:00:00.000000","2025-05-28 19:00:00.000000","2025-05-28 22:00:00.000000","2025-05-29 01:00:00.000000","2025-05-29 04:00:00.000000","2025-05-29 07:00:00.000000","2025-05-29 10:00:00.000000","2025-05-29 13:00:00.000000","2025-05-29 16:00:00.000000","2025-05-29 19:00:00.000000","2025-05-29 22:00:00.000000","2025-05-30 01:00:00.000000","2025-05-30 04:00:00.000000","2025-05-30 07:00:00.000000","2025-05-30 10:00:00.000000","2025-05-30 13:00:00.000000","2025-05-30 16:00:00.000000","2025-05-30 19:00:00.000000","2025-05-30 22:00:00.000000","2025-05-31 01:00:00.000000","2025-05-31 04:00:00.000000","2025-05-31 07:00:00.000000","2025-05-31 10:00:00.000000","2025-05-31 13:00:00.000000","2025-05-31 16:00:00.000000","2025-05-31 19:00:00.000000","2025-05-31 22:00:00.000000","2025-06-01 01:00:00.000000","2025-06-01 04:00:00.000000","2025-06-01 07:00:00.000000","2025-06-01 10:00:00.000000","2025-06-01 13:00:00.000000","2025-06-01 16:00:00.000000","2025-06-01 19:00:00.000000","2025-06-01 22:00:00.000000","2025-06-02 01:00:00.000000","2025-06-02 04:00:00.000000","2025-06-02 07:00:00.000000","2025-06-02 
10:00:00.000000","2025-06-02 13:00:00.000000","2025-06-02 16:00:00.000000","2025-06-02 19:00:00.000000","2025-06-02 22:00:00.000000","2025-06-03 01:00:00.000000","2025-06-03 04:00:00.000000","2025-06-03 07:00:00.000000","2025-06-03 10:00:00.000000","2025-06-03 13:00:00.000000","2025-06-03 16:00:00.000000","2025-06-03 19:00:00.000000","2025-06-03 22:00:00.000000","2025-06-04 01:00:00.000000","2025-06-04 04:00:00.000000","2025-06-04 07:00:00.000000","2025-06-04 10:00:00.000000","2025-06-04 13:00:00.000000","2025-06-04 16:00:00.000000","2025-06-04 19:00:00.000000","2025-06-04 22:00:00.000000","2025-06-05 01:00:00.000000","2025-06-05 04:00:00.000000","2025-06-05 07:00:00.000000","2025-06-05 10:00:00.000000","2025-06-05 13:00:00.000000","2025-06-05 16:00:00.000000","2025-06-05 19:00:00.000000","2025-06-06 01:00:00.000000","2025-06-06 04:00:00.000000","2025-06-06 07:00:00.000000"],"y":[15,19,19,14,12,10,10,15,19,19,16,13,10,11,18,23,25,21,17,13,13,20,27,29,20,16,13,14,22,28,30,24,20,18,18,23,28,28,24,20,17,16,15,22,27,27,22,19,17,16,16,21,25,26,21,17,15,14,14,21,26,28,25,21,19,18,18,23,27,28,24,21,19,17,16,21,25,24,20,18,17,16,14,20,24,24,21,17,14,12,11,16,19,19,16,13,12,11,10,15,18,18,16,14,13,12,11,16,19,19,16,15,14,13,13,17,21,18,15,13,12,12,18,23,24,21,18,16,15,14,17,19,19,16,14,12,11,10,14,17,17,14,11,9,8,9,15,19,20,15,13,11,11,10,13,16,16,12,10,9,9,8,13,16,16,14,12,11,10,13,17,17,12,11,10,10,9,13,17,16,12,10,8,7,8,13,18,19,14,10,9,9,9,15,22,23,16,11,10,9,10,16,23,23,17,14,13,13,13,18,24,25,20,18,16,15,13,19,25,26,18,14,12,11,11,17,23,23,15,12,10,9,8,14,18,18,15,12,10,9,9,15,21,21,15,11,9,7,8,15,22,23,15,11,9,7,8,15,20,21,17,13,10,9,9,17,23,24,15,10,9,8,14,20,20,15,13,11,10,10,15,19,19,15,13,11,9,9,14,19,19,14,12,10,9,8,13,18,18,15,13,12,11,11,15,18,18,11,9,8,7,6,11,16,17,11,9,7,5,5,11,17,18,11,7,5,5,11,17,18,13,10,9,8,8,13,18,19,15,10,9,9,8,13,17,17,15,13,12,11,11,15,19,19,13,11,9,9,8,15,20,21,16,14,13,12,11,15,19,19,16,14,13,13,11,15,19,18,14,13,12,11,10,13,16,15,1
2,11,11,11,10,13,15,15,13,11,11,10,9,13,16,15,13,12,11,11,10,14,16,15,14,12,11,11,10,14,17,17,13,10,9,8,7,13,17,17,14,12,11,11,10,15,19,19,15,13,11,11,10,13,15,15,12,10,9,8,7,11,13,13,9,7,6,4,4,9,14,15,11,8,8,7],"mode":"lines+markers","name":"6 Day Forecast","type":"scatter","marker":{"color":"rgba(255,127,14,0.6)","line":{"color":"rgba(255,127,14,1)"}},"error_y":{"color":"rgba(255,127,14,0.6)"},"error_x":{"color":"rgba(255,127,14,0.6)"},"line":{"color":"rgba(255,127,14,0.6)"},"xaxis":"x","yaxis":"y","frame":null},{"x":["2025-04-03 08:00:00.000000","2025-04-03 11:00:00.000000","2025-04-03 14:00:00.000000","2025-04-03 17:00:00.000000","2025-04-03 20:00:00.000000","2025-04-03 23:00:00.000000","2025-04-04 02:00:00.000000","2025-04-04 05:00:00.000000","2025-04-04 08:00:00.000000","2025-04-04 11:00:00.000000","2025-04-04 14:00:00.000000","2025-04-04 17:00:00.000000","2025-04-04 20:00:00.000000","2025-04-04 23:00:00.000000","2025-04-05 02:00:00.000000","2025-04-05 05:00:00.000000","2025-04-05 08:00:00.000000","2025-04-05 11:00:00.000000","2025-04-05 14:00:00.000000","2025-04-05 17:00:00.000000","2025-04-05 20:00:00.000000","2025-04-05 23:00:00.000000","2025-04-06 02:00:00.000000","2025-04-06 04:00:00.000000","2025-04-06 07:00:00.000000","2025-04-06 10:00:00.000000","2025-04-06 13:00:00.000000","2025-04-06 16:00:00.000000","2025-04-06 19:00:00.000000","2025-04-06 22:00:00.000000","2025-04-07 01:00:00.000000","2025-04-07 04:00:00.000000","2025-04-07 07:00:00.000000","2025-04-07 10:00:00.000000","2025-04-07 13:00:00.000000","2025-04-07 16:00:00.000000","2025-04-07 19:00:00.000000","2025-04-07 22:00:00.000000","2025-04-08 01:00:00.000000","2025-04-08 04:00:00.000000","2025-04-08 07:00:00.000000","2025-04-08 10:00:00.000000","2025-04-08 13:00:00.000000","2025-04-08 16:00:00.000000","2025-04-08 19:00:00.000000","2025-04-08 22:00:00.000000","2025-04-09 01:00:00.000000","2025-04-09 04:00:00.000000","2025-04-09 07:00:00.000000","2025-04-09 10:00:00.000000","2025-04-09 
13:00:00.000000","2025-04-09 16:00:00.000000","2025-04-09 19:00:00.000000","2025-04-09 22:00:00.000000","2025-04-10 01:00:00.000000","2025-04-10 04:00:00.000000","2025-04-10 07:00:00.000000","2025-04-10 10:00:00.000000","2025-04-10 13:00:00.000000","2025-04-10 16:00:00.000000","2025-04-10 19:00:00.000000","2025-04-10 22:00:00.000000","2025-04-11 01:00:00.000000","2025-04-11 04:00:00.000000","2025-04-11 07:00:00.000000","2025-04-11 10:00:00.000000","2025-04-11 13:00:00.000000","2025-04-11 16:00:00.000000","2025-04-11 19:00:00.000000","2025-04-11 22:00:00.000000","2025-04-12 01:00:00.000000","2025-04-12 04:00:00.000000","2025-04-12 07:00:00.000000","2025-04-12 10:00:00.000000","2025-04-12 13:00:00.000000","2025-04-12 16:00:00.000000","2025-04-12 19:00:00.000000","2025-04-12 22:00:00.000000","2025-04-13 01:00:00.000000","2025-04-13 04:00:00.000000","2025-04-13 07:00:00.000000","2025-04-13 10:00:00.000000","2025-04-13 13:00:00.000000","2025-04-13 16:00:00.000000","2025-04-13 19:00:00.000000","2025-04-13 22:00:00.000000","2025-04-14 01:00:00.000000","2025-04-14 04:00:00.000000","2025-04-14 07:00:00.000000","2025-04-14 10:00:00.000000","2025-04-14 13:00:00.000000","2025-04-14 16:00:00.000000","2025-04-14 19:00:00.000000","2025-04-14 22:00:00.000000","2025-04-15 01:00:00.000000","2025-04-15 04:00:00.000000","2025-04-15 07:00:00.000000","2025-04-15 10:00:00.000000","2025-04-15 13:00:00.000000","2025-04-15 16:00:00.000000","2025-04-15 19:00:00.000000","2025-04-15 22:00:00.000000","2025-04-16 01:00:00.000000","2025-04-16 04:00:00.000000","2025-04-16 07:00:00.000000","2025-04-16 10:00:00.000000","2025-04-16 13:00:00.000000","2025-04-16 16:00:00.000000","2025-04-16 19:00:00.000000","2025-04-16 22:00:00.000000","2025-04-17 01:00:00.000000","2025-04-17 04:00:00.000000","2025-04-17 07:00:00.000000","2025-04-17 10:00:00.000000","2025-04-17 13:00:00.000000","2025-04-17 16:00:00.000000","2025-04-17 19:00:00.000000","2025-04-17 22:00:00.000000","2025-04-18 
01:00:00.000000","2025-04-18 04:00:00.000000","2025-04-18 07:00:00.000000","2025-04-18 10:00:00.000000","2025-04-18 16:00:00.000000","2025-04-18 19:00:00.000000","2025-04-18 22:00:00.000000","2025-04-19 01:00:00.000000","2025-04-19 04:00:00.000000","2025-04-19 07:00:00.000000","2025-04-19 10:00:00.000000","2025-04-19 13:00:00.000000","2025-04-19 16:00:00.000000","2025-04-19 19:00:00.000000","2025-04-19 22:00:00.000000","2025-04-20 01:00:00.000000","2025-04-20 04:00:00.000000","2025-04-20 07:00:00.000000","2025-04-20 10:00:00.000000","2025-04-20 13:00:00.000000","2025-04-20 16:00:00.000000","2025-04-20 19:00:00.000000","2025-04-20 22:00:00.000000","2025-04-21 01:00:00.000000","2025-04-21 04:00:00.000000","2025-04-21 07:00:00.000000","2025-04-21 10:00:00.000000","2025-04-21 13:00:00.000000","2025-04-21 16:00:00.000000","2025-04-21 19:00:00.000000","2025-04-21 22:00:00.000000","2025-04-22 01:00:00.000000","2025-04-22 04:00:00.000000","2025-04-22 07:00:00.000000","2025-04-22 10:00:00.000000","2025-04-22 13:00:00.000000","2025-04-22 16:00:00.000000","2025-04-22 19:00:00.000000","2025-04-22 22:00:00.000000","2025-04-23 01:00:00.000000","2025-04-23 04:00:00.000000","2025-04-23 07:00:00.000000","2025-04-23 10:00:00.000000","2025-04-23 13:00:00.000000","2025-04-23 16:00:00.000000","2025-04-23 19:00:00.000000","2025-04-23 22:00:00.000000","2025-04-24 01:00:00.000000","2025-04-24 04:00:00.000000","2025-04-24 07:00:00.000000","2025-04-24 10:00:00.000000","2025-04-24 13:00:00.000000","2025-04-24 16:00:00.000000","2025-04-24 19:00:00.000000","2025-04-24 22:00:00.000000","2025-04-25 01:00:00.000000","2025-04-25 04:00:00.000000","2025-04-25 07:00:00.000000","2025-04-25 10:00:00.000000","2025-04-25 13:00:00.000000","2025-04-25 16:00:00.000000","2025-04-25 19:00:00.000000","2025-04-25 22:00:00.000000","2025-04-26 01:00:00.000000","2025-04-26 04:00:00.000000","2025-04-26 07:00:00.000000","2025-04-26 10:00:00.000000","2025-04-26 13:00:00.000000","2025-04-26 
16:00:00.000000","2025-04-26 19:00:00.000000","2025-04-26 22:00:00.000000","2025-04-27 01:00:00.000000","2025-04-27 04:00:00.000000","2025-04-27 07:00:00.000000","2025-04-27 10:00:00.000000","2025-04-27 13:00:00.000000","2025-04-27 16:00:00.000000","2025-04-27 19:00:00.000000","2025-04-27 22:00:00.000000","2025-04-28 01:00:00.000000","2025-04-28 04:00:00.000000","2025-04-28 07:00:00.000000","2025-04-28 10:00:00.000000","2025-04-28 13:00:00.000000","2025-04-28 16:00:00.000000","2025-04-28 19:00:00.000000","2025-04-28 22:00:00.000000","2025-04-29 01:00:00.000000","2025-04-29 04:00:00.000000","2025-04-29 07:00:00.000000","2025-04-29 10:00:00.000000","2025-04-29 13:00:00.000000","2025-04-29 16:00:00.000000","2025-04-29 19:00:00.000000","2025-04-29 22:00:00.000000","2025-04-30 01:00:00.000000","2025-04-30 04:00:00.000000","2025-04-30 07:00:00.000000","2025-04-30 10:00:00.000000","2025-04-30 13:00:00.000000","2025-04-30 16:00:00.000000","2025-04-30 19:00:00.000000","2025-04-30 22:00:00.000000","2025-05-01 01:00:00.000000","2025-05-01 04:00:00.000000","2025-05-01 07:00:00.000000","2025-05-01 10:00:00.000000","2025-05-01 13:00:00.000000","2025-05-01 16:00:00.000000","2025-05-01 19:00:00.000000","2025-05-01 22:00:00.000000","2025-05-02 01:00:00.000000","2025-05-02 04:00:00.000000","2025-05-02 07:00:00.000000","2025-05-02 10:00:00.000000","2025-05-02 13:00:00.000000","2025-05-02 16:00:00.000000","2025-05-02 19:00:00.000000","2025-05-02 22:00:00.000000","2025-05-03 01:00:00.000000","2025-05-03 04:00:00.000000","2025-05-03 07:00:00.000000","2025-05-03 10:00:00.000000","2025-05-03 13:00:00.000000","2025-05-03 16:00:00.000000","2025-05-03 19:00:00.000000","2025-05-03 22:00:00.000000","2025-05-04 01:00:00.000000","2025-05-04 04:00:00.000000","2025-05-04 07:00:00.000000","2025-05-04 10:00:00.000000","2025-05-04 13:00:00.000000","2025-05-04 16:00:00.000000","2025-05-04 19:00:00.000000","2025-05-04 22:00:00.000000","2025-05-05 01:00:00.000000","2025-05-05 
04:00:00.000000","2025-05-05 07:00:00.000000","2025-05-05 10:00:00.000000","2025-05-05 13:00:00.000000","2025-05-05 16:00:00.000000","2025-05-05 19:00:00.000000","2025-05-05 22:00:00.000000","2025-05-06 01:00:00.000000","2025-05-06 04:00:00.000000","2025-05-06 07:00:00.000000","2025-05-06 10:00:00.000000","2025-05-06 13:00:00.000000","2025-05-06 16:00:00.000000","2025-05-06 19:00:00.000000","2025-05-06 22:00:00.000000","2025-05-07 01:00:00.000000","2025-05-07 04:00:00.000000","2025-05-07 07:00:00.000000","2025-05-07 10:00:00.000000","2025-05-07 13:00:00.000000","2025-05-07 16:00:00.000000","2025-05-07 19:00:00.000000","2025-05-07 22:00:00.000000","2025-05-08 01:00:00.000000","2025-05-08 04:00:00.000000","2025-05-08 07:00:00.000000","2025-05-08 10:00:00.000000","2025-05-08 13:00:00.000000","2025-05-08 16:00:00.000000","2025-05-08 19:00:00.000000","2025-05-08 22:00:00.000000","2025-05-09 01:00:00.000000","2025-05-09 04:00:00.000000","2025-05-09 07:00:00.000000","2025-05-09 10:00:00.000000","2025-05-09 13:00:00.000000","2025-05-09 16:00:00.000000","2025-05-09 19:00:00.000000","2025-05-09 22:00:00.000000","2025-05-10 01:00:00.000000","2025-05-10 04:00:00.000000","2025-05-10 07:00:00.000000","2025-05-10 10:00:00.000000","2025-05-10 13:00:00.000000","2025-05-10 16:00:00.000000","2025-05-10 19:00:00.000000","2025-05-10 22:00:00.000000","2025-05-11 01:00:00.000000","2025-05-11 04:00:00.000000","2025-05-11 07:00:00.000000","2025-05-11 10:00:00.000000","2025-05-11 13:00:00.000000","2025-05-11 16:00:00.000000","2025-05-11 19:00:00.000000","2025-05-11 22:00:00.000000","2025-05-12 01:00:00.000000","2025-05-12 04:00:00.000000","2025-05-12 07:00:00.000000","2025-05-12 10:00:00.000000","2025-05-12 13:00:00.000000","2025-05-12 16:00:00.000000","2025-05-12 19:00:00.000000","2025-05-13 01:00:00.000000","2025-05-13 04:00:00.000000","2025-05-13 07:00:00.000000","2025-05-13 10:00:00.000000","2025-05-13 13:00:00.000000","2025-05-13 16:00:00.000000","2025-05-13 
19:00:00.000000","2025-05-13 22:00:00.000000","2025-05-14 01:00:00.000000","2025-05-14 04:00:00.000000","2025-05-14 07:00:00.000000","2025-05-14 10:00:00.000000","2025-05-14 13:00:00.000000","2025-05-14 16:00:00.000000","2025-05-14 19:00:00.000000","2025-05-14 22:00:00.000000","2025-05-15 01:00:00.000000","2025-05-15 04:00:00.000000","2025-05-15 07:00:00.000000","2025-05-15 10:00:00.000000","2025-05-15 13:00:00.000000","2025-05-15 16:00:00.000000","2025-05-15 19:00:00.000000","2025-05-15 22:00:00.000000","2025-05-16 01:00:00.000000","2025-05-16 04:00:00.000000","2025-05-16 07:00:00.000000","2025-05-16 10:00:00.000000","2025-05-16 13:00:00.000000","2025-05-16 16:00:00.000000","2025-05-16 19:00:00.000000","2025-05-16 22:00:00.000000","2025-05-17 01:00:00.000000","2025-05-17 04:00:00.000000","2025-05-17 07:00:00.000000","2025-05-17 10:00:00.000000","2025-05-17 13:00:00.000000","2025-05-17 16:00:00.000000","2025-05-17 19:00:00.000000","2025-05-17 22:00:00.000000","2025-05-18 01:00:00.000000","2025-05-18 04:00:00.000000","2025-05-18 07:00:00.000000","2025-05-18 10:00:00.000000","2025-05-18 13:00:00.000000","2025-05-18 16:00:00.000000","2025-05-18 19:00:00.000000","2025-05-18 22:00:00.000000","2025-05-19 01:00:00.000000","2025-05-19 04:00:00.000000","2025-05-19 07:00:00.000000","2025-05-19 10:00:00.000000","2025-05-19 13:00:00.000000","2025-05-19 16:00:00.000000","2025-05-19 19:00:00.000000","2025-05-19 22:00:00.000000","2025-05-20 01:00:00.000000","2025-05-20 07:00:00.000000","2025-05-20 10:00:00.000000","2025-05-20 13:00:00.000000","2025-05-20 16:00:00.000000","2025-05-20 19:00:00.000000","2025-05-20 22:00:00.000000","2025-05-21 01:00:00.000000","2025-05-21 04:00:00.000000","2025-05-21 07:00:00.000000","2025-05-21 10:00:00.000000","2025-05-21 13:00:00.000000","2025-05-21 16:00:00.000000","2025-05-21 19:00:00.000000","2025-05-21 22:00:00.000000","2025-05-22 01:00:00.000000","2025-05-22 04:00:00.000000","2025-05-22 07:00:00.000000","2025-05-22 
10:00:00.000000","2025-05-22 13:00:00.000000","2025-05-22 16:00:00.000000","2025-05-22 19:00:00.000000","2025-05-22 22:00:00.000000","2025-05-23 01:00:00.000000","2025-05-23 04:00:00.000000","2025-05-23 07:00:00.000000","2025-05-23 10:00:00.000000","2025-05-23 13:00:00.000000","2025-05-23 16:00:00.000000","2025-05-23 19:00:00.000000","2025-05-23 22:00:00.000000","2025-05-24 01:00:00.000000","2025-05-24 04:00:00.000000","2025-05-24 07:00:00.000000","2025-05-24 10:00:00.000000","2025-05-24 13:00:00.000000","2025-05-24 16:00:00.000000","2025-05-24 19:00:00.000000","2025-05-24 22:00:00.000000","2025-05-25 01:00:00.000000","2025-05-25 04:00:00.000000","2025-05-25 07:00:00.000000","2025-05-25 10:00:00.000000","2025-05-25 13:00:00.000000","2025-05-25 16:00:00.000000","2025-05-25 19:00:00.000000","2025-05-25 22:00:00.000000","2025-05-26 01:00:00.000000","2025-05-26 04:00:00.000000","2025-05-26 07:00:00.000000","2025-05-26 10:00:00.000000","2025-05-26 13:00:00.000000","2025-05-26 16:00:00.000000","2025-05-26 19:00:00.000000","2025-05-26 22:00:00.000000","2025-05-27 01:00:00.000000","2025-05-27 04:00:00.000000","2025-05-27 07:00:00.000000","2025-05-27 10:00:00.000000","2025-05-27 13:00:00.000000","2025-05-27 16:00:00.000000","2025-05-27 19:00:00.000000","2025-05-27 22:00:00.000000","2025-05-28 01:00:00.000000","2025-05-28 04:00:00.000000","2025-05-28 07:00:00.000000","2025-05-28 10:00:00.000000","2025-05-28 13:00:00.000000","2025-05-28 16:00:00.000000","2025-05-28 19:00:00.000000","2025-05-28 22:00:00.000000","2025-05-29 01:00:00.000000","2025-05-29 04:00:00.000000","2025-05-29 07:00:00.000000","2025-05-29 10:00:00.000000","2025-05-29 13:00:00.000000","2025-05-29 16:00:00.000000","2025-05-29 19:00:00.000000","2025-05-29 22:00:00.000000","2025-05-30 01:00:00.000000","2025-05-30 04:00:00.000000","2025-05-30 07:00:00.000000","2025-05-30 10:00:00.000000","2025-05-30 13:00:00.000000","2025-05-30 16:00:00.000000","2025-05-30 19:00:00.000000","2025-05-30 
22:00:00.000000","2025-05-31 01:00:00.000000","2025-05-31 04:00:00.000000","2025-05-31 07:00:00.000000","2025-05-31 10:00:00.000000","2025-05-31 13:00:00.000000","2025-05-31 16:00:00.000000","2025-05-31 19:00:00.000000","2025-05-31 22:00:00.000000","2025-06-01 01:00:00.000000","2025-06-01 04:00:00.000000","2025-06-01 07:00:00.000000","2025-06-01 10:00:00.000000","2025-06-01 13:00:00.000000","2025-06-01 16:00:00.000000","2025-06-01 19:00:00.000000","2025-06-01 22:00:00.000000","2025-06-02 01:00:00.000000","2025-06-02 04:00:00.000000","2025-06-02 07:00:00.000000","2025-06-02 10:00:00.000000","2025-06-02 13:00:00.000000","2025-06-02 16:00:00.000000","2025-06-02 19:00:00.000000","2025-06-02 22:00:00.000000","2025-06-03 01:00:00.000000","2025-06-03 04:00:00.000000","2025-06-03 07:00:00.000000","2025-06-03 10:00:00.000000","2025-06-03 13:00:00.000000","2025-06-03 16:00:00.000000","2025-06-03 19:00:00.000000","2025-06-03 22:00:00.000000","2025-06-04 01:00:00.000000","2025-06-04 04:00:00.000000","2025-06-04 07:00:00.000000","2025-06-04 10:00:00.000000","2025-06-04 13:00:00.000000","2025-06-04 16:00:00.000000","2025-06-04 19:00:00.000000","2025-06-04 22:00:00.000000","2025-06-05 01:00:00.000000","2025-06-05 04:00:00.000000","2025-06-05 07:00:00.000000","2025-06-05 10:00:00.000000","2025-06-05 13:00:00.000000","2025-06-05 16:00:00.000000","2025-06-05 19:00:00.000000","2025-06-06 01:00:00.000000","2025-06-06 04:00:00.000000","2025-06-06 
07:00:00.000000"],"y":[12,18,21,21,17,14,13,12,11,15,18,19,16,13,13,12,11,15,17,19,15,11,9,8,8,16,21,22,18,14,13,12,12,17,19,19,16,13,11,9,9,15,18,19,16,13,11,9,9,15,20,22,18,15,14,14,13,21,26,28,23,20,19,16,15,19,22,22,19,15,13,12,12,19,29,30,24,21,20,19,19,25,30,32,26,22,19,18,17,20,24,23,21,18,15,13,13,18,26,28,23,19,18,17,17,22,27,28,23,19,16,15,14,23,29,29,25,20,18,17,17,24,29,22,17,15,15,16,23,28,29,24,22,20,20,19,23,27,25,18,15,13,12,12,17,21,20,16,14,13,13,14,17,19,19,17,14,12,11,11,16,22,23,19,15,13,13,15,21,26,26,21,19,18,18,16,20,24,22,19,17,16,15,15,17,19,19,17,15,14,14,14,15,18,18,16,14,13,12,12,16,19,18,16,15,13,12,12,16,18,17,14,13,12,11,10,14,18,17,14,11,9,7,7,13,17,17,14,11,9,7,7,14,20,20,15,11,9,8,8,16,21,21,15,12,11,11,11,17,23,23,17,15,15,14,14,18,23,25,21,19,17,16,15,19,25,25,21,18,16,15,15,19,20,19,14,11,10,9,9,13,17,17,13,10,8,6,5,14,19,19,15,11,9,7,7,14,20,21,15,11,9,9,9,17,23,23,17,14,13,12,11,18,23,22,16,11,10,10,15,21,22,16,13,11,10,9,16,19,18,15,13,12,11,9,14,18,18,14,11,9,7,6,14,19,19,15,12,11,10,10,13,13,13,10,9,7,7,7,11,14,14,10,7,5,3,2,9,14,14,10,7,4,1,11,18,18,11,7,5,4,4,11,16,15,12,9,7,6,5,11,16,17,14,12,10,10,10,13,16,17,13,11,11,11,10,14,18,18,15,13,12,10,9,13,17,18,15,13,13,12,12,16,21,20,14,12,11,10,9,12,15,15,13,11,11,10,10,13,14,14,13,12,11,11,10,13,16,15,12,11,9,9,9,12,15,15,13,11,10,9,8,12,15,16,11,7,5,5,5,11,17,16,12,11,10,9,9,13,17,18,14,12,11,10,10,12,13,12,9,8,7,7,8,11,13,13,10,9,7,6,5,10,13,14,10,8,8,8],"mode":"lines+markers","name":"1 Day 
Forecast","type":"scatter","marker":{"color":"rgba(44,160,44,0.6)","line":{"color":"rgba(44,160,44,1)"}},"error_y":{"color":"rgba(44,160,44,0.6)"},"error_x":{"color":"rgba(44,160,44,0.6)"},"line":{"color":"rgba(44,160,44,0.6)"},"xaxis":"x","yaxis":"y","frame":null}],"highlight":{"on":"plotly_click","persistent":false,"dynamic":false,"selectize":false,"opacityDim":0.20000000000000001,"selected":{"opacity":1},"debounce":0},"shinyEvents":["plotly_hover","plotly_click","plotly_selected","plotly_relayout","plotly_brushed","plotly_brushing","plotly_clickannotation","plotly_doubleclick","plotly_deselect","plotly_afterplot","plotly_sunburstclick"],"base_url":"https://plot.ly"},"evals":[],"jsHooks":[]}&lt;/script>
&lt;p>Let’s look at some statistics across all of the weather stations: the mean error, the mean absolute error, the standard deviation of the error, and the 95% quantiles for the error.&lt;/p>
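&lt;p>As a rough sketch of how statistics like these could be computed, here’s some base R run over a small made-up data frame. The column names (&lt;code>days_ahead&lt;/code>, &lt;code>forecast&lt;/code>, &lt;code>recorded&lt;/code>), the sign convention (error = forecast minus recorded), and reading the “95% quantiles” as the 2.5% and 97.5% quantiles are all assumptions for illustration, not necessarily how the table that follows was generated.&lt;/p>
&lt;pre class="r">&lt;code># Made-up illustrative data: column names and values are assumptions
forecasts &lt;- data.frame(
  days_ahead = c(1, 1, 1, 1, 6, 6, 6, 6),
  forecast   = c(15, 19, 12, 10, 14, 21, 11, 9),
  recorded   = c(14.6, 18.5, 12.5, 11.1, 16, 18.5, 12.5, 11.1)
)

# Assumed sign convention: positive error means the forecast was too warm
forecasts$error &lt;- forecasts$forecast - forecasts$recorded

# Summary statistics for one vector of errors
summarise_error &lt;- function(error) {
  data.frame(
    mean_error     = mean(error),
    mean_abs_error = mean(abs(error)),
    sd_error       = sd(error),
    # Assumed reading of the 95% quantiles: the central 95% interval
    q_low          = unname(quantile(error, 0.025)),
    q_high         = unname(quantile(error, 0.975))
  )
}

# One row of statistics per forecast horizon (row names are the horizons)
error_summary &lt;- do.call(
  rbind,
  lapply(split(forecasts$error, forecasts$days_ahead), summarise_error)
)
error_summary&lt;/code>&lt;/pre>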
&lt;div id="loxmnxpvtx" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#loxmnxpvtx table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#loxmnxpvtx thead, #loxmnxpvtx tbody, #loxmnxpvtx tfoot, #loxmnxpvtx tr, #loxmnxpvtx td, #loxmnxpvtx th {
border-style: none;
}
&amp;#10;#loxmnxpvtx p {
margin: 0;
padding: 0;
}
&amp;#10;#loxmnxpvtx .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#loxmnxpvtx .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#loxmnxpvtx .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#loxmnxpvtx .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#loxmnxpvtx .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#loxmnxpvtx .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#loxmnxpvtx .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#loxmnxpvtx .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#loxmnxpvtx .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#loxmnxpvtx .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#loxmnxpvtx .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#loxmnxpvtx .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#loxmnxpvtx .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#loxmnxpvtx .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#loxmnxpvtx .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#loxmnxpvtx .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#loxmnxpvtx .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#loxmnxpvtx .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#loxmnxpvtx .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#loxmnxpvtx .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#loxmnxpvtx .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#loxmnxpvtx .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#loxmnxpvtx .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#loxmnxpvtx .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#loxmnxpvtx .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#loxmnxpvtx .gt_left {
text-align: left;
}
&amp;#10;#loxmnxpvtx .gt_center {
text-align: center;
}
&amp;#10;#loxmnxpvtx .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#loxmnxpvtx .gt_font_normal {
font-weight: normal;
}
&amp;#10;#loxmnxpvtx .gt_font_bold {
font-weight: bold;
}
&amp;#10;#loxmnxpvtx .gt_font_italic {
font-style: italic;
}
&amp;#10;#loxmnxpvtx .gt_super {
font-size: 65%;
}
&amp;#10;#loxmnxpvtx .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#loxmnxpvtx .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#loxmnxpvtx .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#loxmnxpvtx .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#loxmnxpvtx .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#loxmnxpvtx .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#loxmnxpvtx .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#loxmnxpvtx .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#loxmnxpvtx div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_heading">
&lt;td colspan="5" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>Forecast Error Statistics by Forecast Duration&lt;/td>
&lt;/tr>
&amp;#10; &lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="forecast_duration">Forecast Duration&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="mean_forecast_error">Mean Error&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="mean_absolute_error">Mean Absolute Error&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="sd_forecast_error">Standard Deviation of Error&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="nine_five_quantile">95% Quantile&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="forecast_duration" class="gt_row gt_center">86400s (~1 day)&lt;/td>
&lt;td headers="mean_forecast_error" class="gt_row gt_center">0.09&lt;/td>
&lt;td headers="mean_absolute_error" class="gt_row gt_center">1.35&lt;/td>
&lt;td headers="sd_forecast_error" class="gt_row gt_center">1.75&lt;/td>
&lt;td headers="nine_five_quantile" class="gt_row gt_center">-3.6, 3.5&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration" class="gt_row gt_center">172800s (~2 days)&lt;/td>
&lt;td headers="mean_forecast_error" class="gt_row gt_center">0.11&lt;/td>
&lt;td headers="mean_absolute_error" class="gt_row gt_center">1.41&lt;/td>
&lt;td headers="sd_forecast_error" class="gt_row gt_center">1.83&lt;/td>
&lt;td headers="nine_five_quantile" class="gt_row gt_center">-3.7, 3.7&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration" class="gt_row gt_center">259200s (~3 days)&lt;/td>
&lt;td headers="mean_forecast_error" class="gt_row gt_center">0.12&lt;/td>
&lt;td headers="mean_absolute_error" class="gt_row gt_center">1.51&lt;/td>
&lt;td headers="sd_forecast_error" class="gt_row gt_center">1.97&lt;/td>
&lt;td headers="nine_five_quantile" class="gt_row gt_center">-4, 4&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration" class="gt_row gt_center">345600s (~4 days)&lt;/td>
&lt;td headers="mean_forecast_error" class="gt_row gt_center">0.11&lt;/td>
&lt;td headers="mean_absolute_error" class="gt_row gt_center">1.63&lt;/td>
&lt;td headers="sd_forecast_error" class="gt_row gt_center">2.12&lt;/td>
&lt;td headers="nine_five_quantile" class="gt_row gt_center">-4.3, 4.3&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration" class="gt_row gt_center">432000s (~5 days)&lt;/td>
&lt;td headers="mean_forecast_error" class="gt_row gt_center">0.09&lt;/td>
&lt;td headers="mean_absolute_error" class="gt_row gt_center">1.72&lt;/td>
&lt;td headers="sd_forecast_error" class="gt_row gt_center">2.24&lt;/td>
&lt;td headers="nine_five_quantile" class="gt_row gt_center">-4.6, 4.5&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration" class="gt_row gt_center">518400s (~6 days)&lt;/td>
&lt;td headers="mean_forecast_error" class="gt_row gt_center">0.12&lt;/td>
&lt;td headers="mean_absolute_error" class="gt_row gt_center">1.88&lt;/td>
&lt;td headers="sd_forecast_error" class="gt_row gt_center">2.44&lt;/td>
&lt;td headers="nine_five_quantile" class="gt_row gt_center">-5, 5&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>The mean error doesn’t tell us about the accuracy (as negative and positive errors cancel each other out), but more about the bias in the model. It shows that across all forecasts there’s a slight bias towards over-estimating the temperature, but only by a tenth of a degree or so.&lt;/p>
&lt;p>The mean absolute error rises from 1.35 degrees one day out to 1.88 degrees six days out. The standard deviation also increases, which tells us that the errors become more spread out as well.&lt;/p>
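&lt;p>To make the difference between bias and accuracy concrete, here is a small Python sketch of the three statistics in the table. The function name and the sample errors are made up for illustration, and I’m assuming an error is the forecast minus the observed temperature (so a positive mean error means over-estimating).&lt;/p>

```python
from statistics import mean, stdev


def error_summary(errors):
    """Summarise forecast errors (assumed: forecast minus observed, in degrees)."""
    return {
        # Signed mean: positive and negative errors cancel, so this measures bias
        "mean_error": mean(errors),
        # Mean of magnitudes: nothing cancels, so this measures accuracy
        "mean_absolute_error": mean(abs(e) for e in errors),
        # Sample standard deviation: how spread out the errors are
        "sd_error": stdev(errors),
    }


# Hypothetical errors, not taken from the BOM data
errors = [2.0, -2.0, 1.0, -1.0, 0.5]
summary = error_summary(errors)
```

&lt;p>With these made-up values the mean error is 0.1 while the mean absolute error is 1.3: the large positive and negative errors cancel in the first but not in the second.&lt;/p>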
&lt;p>A histogram of the errors shows their distribution and how that changes over time. I’ve overlaid a normal distribution with a fixed mean of 0 and standard deviation of 2 to act as a reference point:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-11-1.png" width="672" />
This shows the mean error slightly shifting to the right and the distribution spreading out as the standard deviation increases. The distribution looks &lt;em>sort of&lt;/em> normal, and for the first two standard deviations it tracks really well. You can see this in the table above, where the mean ± 2 × std. dev. lines up well with the 95% quantiles.&lt;/p>
&lt;p>But we can’t assume it’s normal across the board. This is best visualised with a quantile-quantile (QQ) plot.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-12-1.png" width="672" />
This shows that the forecast error distributions, as mentioned before, track the normal for the first two standard deviations. Outside of that they have fatter tails than a normal, i.e. you’ll find more observations out at 3 or 4 standard deviations than you would in a normal distribution.&lt;/p>
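&lt;p>One rough way to put a number on ‘fatter tails’ is to compare the share of errors that fall more than k standard deviations from the mean with what a normal distribution would give. This is a Python sketch with hypothetical function names and data, not the method used to produce the QQ plot.&lt;/p>

```python
import math


def normal_tail_fraction(k):
    """Probability that a normal variable lies more than k standard deviations from its mean."""
    return math.erfc(k / math.sqrt(2))


def empirical_tail_fraction(errors, k):
    """Share of errors lying more than k sample standard deviations from the sample mean."""
    n = len(errors)
    mu = sum(errors) / n
    sd = math.sqrt(sum((e - mu) ** 2 for e in errors) / (n - 1))
    return sum(1 for e in errors if abs(e - mu) > k * sd) / n


# Hypothetical errors: mostly zero with two extreme values
errors = [0.0] * 98 + [10.0, -10.0]
observed = empirical_tail_fraction(errors, 3)
expected = normal_tail_fraction(3)
```

&lt;p>A normal distribution puts roughly 0.27% of its mass beyond 3 standard deviations; in this made-up sample 2% of the values sit out there, which is the kind of gap a fat-tailed distribution produces.&lt;/p>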
&lt;p>Instead of looking at whole days between the forecast and the recorded temperature, we can switch to the changed data set which contains all of the changed forecasts over time. This gives a more granular view of the time between when the forecast was made and the actual recorded temperature.
&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-13-1.png" width="672" />
There are two things that immediately jump out at me. The first is that both the variance and the mean absolute error over time appear to be much greater for Victoria, Tasmania, and South Australia than for the others. The second is that there appears to be a cyclical nature to the forecast error. Setting aside what I’ll call the ‘southern state’ issue, we’ll investigate the cyclical nature.&lt;/p>
&lt;p>Straight away this looks to be a difference in forecast error based on whether it’s day or night. I wouldn’t have thought that this kind of bias would have come through in the forecast durations. But if you recall the previous histogram that showed when the BOM published its updated models, this wasn’t uniformly distributed. Rather, most of the changes to forecasts occurred between 2pm and 6pm. Therefore the forecast duration - the difference between the forecast change and the forecast time - will be ‘tethered’ to this time and the day/night cycle will show in the forecast durations.&lt;/p>
&lt;p>Let’s code the forecast’s time as ‘night’ if it is after 7pm or before 7am, and ‘day’ otherwise. This makes sense for the time period of the data. Interestingly, if we look at a density plot of the forecast error we see a difference in variation between night and day.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-15-1.png" width="672" />
This, combined with the fact that most of the forecasts are published at a certain time of the day, is why there is a cyclical pattern in the forecast duration graph. The obvious question is why the error would differ between night and day. I’m not going to dive any deeper into this in this post, but it’s something to think about.&lt;/p>
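&lt;p>The day/night coding described above can be sketched in Python as follows. The function name is hypothetical, and the handling of the exact boundaries (7am counts as day, 7pm counts as night) is my assumption, since the text only says ‘after 7pm’ and ‘before 7am’.&lt;/p>

```python
from datetime import datetime


def day_or_night(forecast_time):
    """Label a timestamp 'day' from 7am up to (but not including) 7pm, otherwise 'night'."""
    # range(7, 19) covers the hours 7, 8, ..., 18
    return "day" if forecast_time.hour in range(7, 19) else "night"


# Hypothetical timestamps for illustration
afternoon = day_or_night(datetime(2024, 8, 15, 14, 30))
late_evening = day_or_night(datetime(2024, 8, 15, 23, 0))
```

&lt;p>Here the 2:30pm timestamp is labelled ‘day’ and the 11pm one ‘night’.&lt;/p>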
&lt;h1 id="forecast-error-modelling">Forecast Error Modelling&lt;/h1>
&lt;p>Can we find a way to model the forecast error as a function of some other parameters? The first parameter is obvious: forecast duration. Looking at the mean absolute error over the forecast duration, it increased as the forecast duration increased. This shouldn’t be surprising: as the quote goes, “prediction is very hard, especially about the future”.&lt;/p>
&lt;p>But when we looked at the graph, there was a second source of variation. It appeared that the southern states - Victoria, Tasmania, South Australia - had a higher variation in mean absolute error as compared to the northern states.&lt;/p>
&lt;p>This is where I’m in danger of cosplaying as a meteorologist and being completely off the mark, but let’s have a go. Looking back at the graph of the lowest mean temperature at the start of this article, taken from Tasmania in the south of Australia, what you may have noticed is that the profile of the lowest mean temperature looked different. To my eye it looked rougher and moved up and down far more as compared to the highest mean, which was from a location in the far north of Australia.&lt;/p>
&lt;p>Could this type of variation have an influence on the forecast accuracy? Can we quantify it?&lt;/p>
&lt;h2 id="the-jaggedness-index">The ‘Jaggedness’ Index&lt;/h2>
&lt;p>My guess would be that as we go south, certain meteorological features that I would only be guessing at (weather coming off the Southern Ocean?) make the rises and falls in temperature less ‘smooth’, more varied, and thus more unpredictable.&lt;/p>
&lt;p>My thought is to create a ‘jaggedness’ index for each of the locations and each of the day-lagged forecasts.&lt;/p>
&lt;p>$$ J = \frac{1}{n - 1} \sum_{i=2}^{n} \left| t_i - t_{i-1} \right| $$&lt;/p>
&lt;p>Where \(t_i\) is the i’th temperature observation at a site, and \(n\) is the total number of observations. In plain English: the sum of the absolute differences between consecutive temperature observations, divided by the number of pairs of observations.&lt;/p>
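&lt;p>As a quick check of the formula, here is a minimal Python sketch of the index. The function name and the two temperature series are invented for illustration; it assumes the observations are already in time order.&lt;/p>

```python
def jaggedness(temps):
    """Mean absolute difference between consecutive temperature observations."""
    if len(temps) in (0, 1):
        raise ValueError("need at least two observations")
    # Pair each observation with the one that follows it: n observations give n - 1 pairs
    diffs = [abs(later - earlier) for earlier, later in zip(temps, temps[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical series: one rises steadily, the other swings up and down
smooth = jaggedness([10, 11, 12, 13])
jagged = jaggedness([10, 13, 9, 14])
```

&lt;p>The steady series scores 1.0, while the swinging one scores 4.0 (the absolute differences 3, 4 and 5 averaged over three pairs), even though both start at 10 degrees.&lt;/p>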
&lt;p>To give this some perspective, here are the temperatures over time for the locations with the highest and lowest jaggedness indexes:&lt;/p>
&lt;div id="mltlxppmjm" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#mltlxppmjm table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#mltlxppmjm thead, #mltlxppmjm tbody, #mltlxppmjm tfoot, #mltlxppmjm tr, #mltlxppmjm td, #mltlxppmjm th {
border-style: none;
}
&amp;#10;#mltlxppmjm p {
margin: 0;
padding: 0;
}
&amp;#10;#mltlxppmjm .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#mltlxppmjm .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#mltlxppmjm .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#mltlxppmjm .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#mltlxppmjm .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#mltlxppmjm .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#mltlxppmjm .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#mltlxppmjm .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#mltlxppmjm .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#mltlxppmjm .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#mltlxppmjm .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#mltlxppmjm .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#mltlxppmjm .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#mltlxppmjm .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#mltlxppmjm .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#mltlxppmjm .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#mltlxppmjm .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#mltlxppmjm .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#mltlxppmjm .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#mltlxppmjm .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#mltlxppmjm .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#mltlxppmjm .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#mltlxppmjm .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#mltlxppmjm .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#mltlxppmjm .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#mltlxppmjm .gt_left {
text-align: left;
}
&amp;#10;#mltlxppmjm .gt_center {
text-align: center;
}
&amp;#10;#mltlxppmjm .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#mltlxppmjm .gt_font_normal {
font-weight: normal;
}
&amp;#10;#mltlxppmjm .gt_font_bold {
font-weight: bold;
}
&amp;#10;#mltlxppmjm .gt_font_italic {
font-style: italic;
}
&amp;#10;#mltlxppmjm .gt_super {
font-size: 65%;
}
&amp;#10;#mltlxppmjm .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#mltlxppmjm .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#mltlxppmjm .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#mltlxppmjm .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#mltlxppmjm .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#mltlxppmjm .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#mltlxppmjm .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#mltlxppmjm .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#mltlxppmjm div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_heading">
&lt;td colspan="2" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>Min/Max Site Jaggedness&lt;/td>
&lt;/tr>
&amp;#10; &lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="site">Site&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="jaggedness">Jaggedness&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="site" class="gt_row gt_center">LOW ISLES LIGHTHOUSE, QLD&lt;/td>
&lt;td headers="jaggedness" class="gt_row gt_right">0.1252697&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="site" class="gt_row gt_center">KHANCOBAN AWS, NSW&lt;/td>
&lt;td headers="jaggedness" class="gt_row gt_right">0.4207670&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-17-1.png" width="672" />&lt;/p>
&lt;p>We can now plot each location’s index against its mean absolute error, and we’ll transition through the day forecast durations.&lt;/p>
&lt;p>&lt;img src="index_files/figure-html/unnamed-chunk-18-1.gif" alt="">&lt;!-- -->
We can see a reasonably linear relationship between this jaggedness index and the mean absolute forecast error. Let’s use these two features - duration and jaggedness - as parameters into a simple linear model and see if we can come up with a model that predicts the mean absolute forecast error.&lt;/p>
&lt;h1 id="modeling">Modeling&lt;/h1>
&lt;p>To be honest, the jaggedness index isn’t a great predictor, because you can only calculate it once you’ve got the temperature, and once you’ve got the temperature you can just calculate the forecast error directly! But let’s just have a go, start simple, and see where we end up.&lt;/p>
&lt;p>We’ll use Stan to create a simple linear model with two parameters: the jaggedness as a continuous variable, and the day-lagged forecast duration as a categorical index. Here’s the full specification of the model:&lt;/p>
&lt;pre>&lt;code>data {
  // Training data
  int&amp;lt;lower=1&amp;gt; n;
  vector[n] jaggedness;
  array[n] int&amp;lt;lower=1, upper=6&amp;gt; duration;
  vector[n] mean_forecast_error;
  // Out of sample test set (declared with the same length n as the training data)
  vector[n] jaggedness_t;
  array[n] int&amp;lt;lower=1, upper=6&amp;gt; duration_t;
  vector[n] mean_forecast_error_t;
}
parameters {
  // One intercept and one slope per forecast duration
  vector[6] alpha;
  vector[6] beta_jaggedness;
  real&amp;lt;lower=0&amp;gt; sigma;
}
model {
  // Weakly informative priors
  beta_jaggedness ~ normal(0, 5);
  alpha ~ normal(0, 5);
  sigma ~ exponential(1);
  // Likelihood
  for (i in 1:n) {
    mean_forecast_error[i] ~ normal(alpha[duration[i]] + beta_jaggedness[duration[i]] * jaggedness[i], sigma);
  }
}
generated quantities {
  // Posterior predictive check
  vector[n] y_rep;
  // Out of sample test set
  vector[n] y_test;
  for (i in 1:n) {
    // Posterior predictive checks
    y_rep[i] = normal_rng(
      alpha[duration[i]] + beta_jaggedness[duration[i]] * jaggedness[i],
      sigma
    );
    // Out of sample checks
    y_test[i] = normal_rng(
      alpha[duration_t[i]] + beta_jaggedness[duration_t[i]] * jaggedness_t[i],
      sigma
    );
  }
}
&lt;/code>&lt;/pre>
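&lt;p>To see what the likelihood line is doing, here is a small Python sketch of just the linear predictor: the duration index picks out one intercept and one slope, and the jaggedness is plugged into that line. The function name and all of the parameter values are hypothetical; they are not estimates from the fitted model.&lt;/p>

```python
def expected_error(alpha, beta_jaggedness, duration, jaggedness):
    """Mean of the likelihood for one observation.

    duration is 1-based (1 = one day forecast, ..., 6 = six day forecast),
    matching the Stan indexing, so it is shifted by one for Python lists.
    """
    d = duration - 1
    return alpha[d] + beta_jaggedness[d] * jaggedness


# Hypothetical intercepts and slopes, one per forecast duration
alpha = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
beta_jaggedness = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]

# Expected error for a three day forecast at a site with jaggedness 0.25
mu = expected_error(alpha, beta_jaggedness, 3, 0.25)
```

&lt;p>With these made-up numbers the three day forecast uses the third intercept (1.2) and slope (2.0), giving an expected error of about 1.7. In the Stan model the observed value is then drawn from a normal distribution centred on this mean with standard deviation sigma.&lt;/p>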
&lt;p>After sampling from the posterior (and performing some of the standard checks like trace plots and histograms that I’m not going to show here), we get our alpha and beta parameters (intercept and slope) for each of the indexes (where index 1 = 1 day forecast, index 2 = 2 day forecast, etc), as well as some other statistics about the distributions of each parameter. For completeness, here’s a table of all of the parameters:&lt;/p>
&lt;div id="uvepahhuot" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#uvepahhuot table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#uvepahhuot thead, #uvepahhuot tbody, #uvepahhuot tfoot, #uvepahhuot tr, #uvepahhuot td, #uvepahhuot th {
border-style: none;
}
&amp;#10;#uvepahhuot p {
margin: 0;
padding: 0;
}
&amp;#10;#uvepahhuot .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#uvepahhuot .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#uvepahhuot .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#uvepahhuot .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#uvepahhuot .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#uvepahhuot .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#uvepahhuot .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#uvepahhuot .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#uvepahhuot .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#uvepahhuot .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#uvepahhuot .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#uvepahhuot .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#uvepahhuot .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#uvepahhuot .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#uvepahhuot .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#uvepahhuot .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#uvepahhuot .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#uvepahhuot .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#uvepahhuot .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#uvepahhuot .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#uvepahhuot .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#uvepahhuot .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#uvepahhuot .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#uvepahhuot .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#uvepahhuot .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#uvepahhuot .gt_left {
text-align: left;
}
&amp;#10;#uvepahhuot .gt_center {
text-align: center;
}
&amp;#10;#uvepahhuot .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#uvepahhuot .gt_font_normal {
font-weight: normal;
}
&amp;#10;#uvepahhuot .gt_font_bold {
font-weight: bold;
}
&amp;#10;#uvepahhuot .gt_font_italic {
font-style: italic;
}
&amp;#10;#uvepahhuot .gt_super {
font-size: 65%;
}
&amp;#10;#uvepahhuot .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#uvepahhuot .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#uvepahhuot .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#uvepahhuot .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#uvepahhuot .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#uvepahhuot .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#uvepahhuot .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#uvepahhuot .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#uvepahhuot div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_heading">
&lt;td colspan="6" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>Model Parameter: Posterior Distribution Statistics&lt;/td>
&lt;/tr>
&amp;#10; &lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" style="font-weight: bold;" scope="col" id="variable">Variable&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" style="font-weight: bold;" scope="col" id="mean">Mean&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" style="font-weight: bold;" scope="col" id="median">Median&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" style="font-weight: bold;" scope="col" id="sd">Std. Dev&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" style="font-weight: bold;" scope="col" id="a2.5%">Lower 2.5%&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" style="font-weight: bold;" scope="col" id="a97.5%">Upper 97.5%&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">sigma&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">0.31&lt;/td>
&lt;td headers="median" class="gt_row gt_center">0.31&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.00&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">0.30&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">0.32&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">alpha[1]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">0.78&lt;/td>
&lt;td headers="median" class="gt_row gt_center">0.78&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.06&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">0.66&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">0.91&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">alpha[2]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">0.76&lt;/td>
&lt;td headers="median" class="gt_row gt_center">0.76&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.06&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">0.64&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">0.89&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">alpha[3]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">0.76&lt;/td>
&lt;td headers="median" class="gt_row gt_center">0.76&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.06&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">0.63&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">0.88&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">alpha[4]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">0.81&lt;/td>
&lt;td headers="median" class="gt_row gt_center">0.81&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.06&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">0.69&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">0.93&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">alpha[5]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">0.85&lt;/td>
&lt;td headers="median" class="gt_row gt_center">0.85&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.06&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">0.73&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">0.97&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">alpha[6]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">0.82&lt;/td>
&lt;td headers="median" class="gt_row gt_center">0.81&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.06&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">0.69&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">0.94&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">beta_jaggedness[1]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">2.14&lt;/td>
&lt;td headers="median" class="gt_row gt_center">2.14&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.24&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">1.66&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">2.59&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">beta_jaggedness[2]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">2.43&lt;/td>
&lt;td headers="median" class="gt_row gt_center">2.43&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.24&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">1.96&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">2.89&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">beta_jaggedness[3]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">2.84&lt;/td>
&lt;td headers="median" class="gt_row gt_center">2.84&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.23&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">2.40&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">3.29&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">beta_jaggedness[4]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">3.08&lt;/td>
&lt;td headers="median" class="gt_row gt_center">3.08&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.24&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">2.62&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">3.54&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">beta_jaggedness[5]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">3.27&lt;/td>
&lt;td headers="median" class="gt_row gt_center">3.27&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.23&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">2.83&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">3.73&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="variable" class="gt_row gt_center">beta_jaggedness[6]&lt;/td>
&lt;td headers="mean" class="gt_row gt_center">4.02&lt;/td>
&lt;td headers="median" class="gt_row gt_center">4.02&lt;/td>
&lt;td headers="sd" class="gt_row gt_center">0.23&lt;/td>
&lt;td headers="2.5%" class="gt_row gt_center">3.55&lt;/td>
&lt;td headers="97.5%" class="gt_row gt_center">4.47&lt;/td>&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>I’m not going to go through each one, but let’s take a look at index &lt;em>[6]&lt;/em>: the six-day forecast. The parameters tell us the following:&lt;/p>
&lt;ul>
&lt;li>&lt;em>alpha[6]&lt;/em> - for a location with zero jaggedness (the intercept), you would expect on average a mean forecast error of 0.82 degrees, with a 95% credible interval between 0.69 and 0.94.&lt;/li>
&lt;li>&lt;em>beta_jaggedness[6]&lt;/em> - for each increase in jaggedness of 1, you would expect on average the mean forecast error to go up by about 4 degrees, with a 95% credible interval between 3.55 and 4.47.&lt;/li>
&lt;/ul>
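&lt;p>To make that interpretation concrete, here’s a minimal sketch that plugs the posterior means for index &lt;em>[6]&lt;/em> from the table into a linear predictor of the form alpha + beta × jaggedness. This isn’t the code behind the post: the helper name &lt;code>expected_error&lt;/code> and the example jaggedness values are assumptions made purely for illustration.&lt;/p>
&lt;pre>&lt;code class="language-r"># Posterior means for the six-day forecast, copied from the table above
alpha_6 &amp;lt;- 0.82
beta_jaggedness_6 &amp;lt;- 4.02

# Hypothetical helper: expected mean forecast error for a given jaggedness
expected_error &amp;lt;- function(jaggedness, alpha, beta) {
  alpha + beta * jaggedness
}

# Illustrative jaggedness values (assumed, not taken from the data)
jaggedness &amp;lt;- c(0, 0.1, 0.5)
expected_error(jaggedness, alpha_6, beta_jaggedness_6)
&lt;/code>&lt;/pre>
&lt;p>At zero jaggedness this returns the intercept, and every extra 0.1 of jaggedness adds roughly 0.4 degrees to the expected error. It only uses the posterior means, so it ignores the uncertainty captured by the credible intervals.&lt;/p>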
&lt;p>Tables are hard to interpret, so instead we’ll plot a line for each set of parameters drawn from the posterior, which shows the range of values the model believes the parameters could take. As before, we transition through each of the forecast durations:&lt;/p>
&lt;p>&lt;img src="index_files/figure-html/unnamed-chunk-23-1.gif" alt="">&lt;/p>
&lt;p>Some reasonably tight credible intervals for the parameters. But there’s one parameter left we haven’t looked at: sigma, the residual standard deviation. It’s a measure of how far the observed data deviates from our fitted values. We’ve used a single sigma parameter, so it doesn’t vary across the forecast indexes/durations.&lt;/p>
&lt;p>We can check how well our model has done by performing posterior predictive checks. In our Stan model we generated random data for each of the observations using all of the alpha, beta, and sigma values drawn from the posterior. We then limited the data to those values that are within the 95% credible interval and plotted each one in green, with the real observations overlaid in orange.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-24-1.png" width="672" />&lt;/p>
&lt;p>Most of our data lies within the bands of the generated data, which is good. However, there are a couple of things that don’t look quite right. The first is that there’s a bias in the model: almost all of the values which don’t fit are above the bands, rather than below. Our model, which uses normal residuals/errors, assumes that the errors are approximately symmetric, which they are not.&lt;/p>
&lt;p>Secondly, those values above the bands are a long way from where the model would expect them to be. So our model doesn’t do a great job of predicting the mean forecast error based on jaggedness for those sites.&lt;/p>
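&lt;p>For anyone wanting to see the mechanics of the posterior predictive check described above, here’s a rough sketch in base R. It is not the Stan code from the post. The posterior draws are stand-ins, simulated independently from normal distributions around the six-day means and standard deviations in the table (the 0.005 used for sigma’s spread is a guess, as the table rounds it to 0.00), and the jaggedness values are hypothetical.&lt;/p>
&lt;pre>&lt;code class="language-r">set.seed(1)

# Stand-in posterior draws (assumed); real ones would come from the fitted Stan model
n_draws &amp;lt;- 4000
alpha &amp;lt;- rnorm(n_draws, mean = 0.82, sd = 0.06)
beta &amp;lt;- rnorm(n_draws, mean = 4.02, sd = 0.23)
sigma &amp;lt;- rnorm(n_draws, mean = 0.31, sd = 0.005)

# Hypothetical jaggedness values for three locations
jaggedness &amp;lt;- c(0.05, 0.2, 0.4)

# One simulated observation per draw and location:
# rows are posterior draws, columns are locations
y_rep &amp;lt;- sapply(jaggedness, function(j) {
  rnorm(n_draws, mean = alpha + beta * j, sd = sigma)
})

# 95% interval of the simulated values for each location
pp_interval &amp;lt;- apply(y_rep, 2, quantile, probs = c(0.025, 0.975))
pp_interval
&lt;/code>&lt;/pre>
&lt;p>Because the simulated values are drawn from a normal distribution, the resulting bands are symmetric around the fitted line, which is why a cluster of observations sitting only above the bands points to a problem with the normal error assumption. Drawing alpha and beta independently is a simplification; real posterior draws are usually correlated.&lt;/p>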
&lt;p>If we calculate the 95% interval of our generated values from the model, then calculate the percentage of recorded values that fall in that range, we get the following results:&lt;/p>
&lt;div id="avmfejydrj" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#avmfejydrj table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#avmfejydrj thead, #avmfejydrj tbody, #avmfejydrj tfoot, #avmfejydrj tr, #avmfejydrj td, #avmfejydrj th {
border-style: none;
}
&amp;#10;#avmfejydrj p {
margin: 0;
padding: 0;
}
&amp;#10;#avmfejydrj .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#avmfejydrj .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#avmfejydrj .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#avmfejydrj .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#avmfejydrj .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#avmfejydrj .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#avmfejydrj .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#avmfejydrj .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#avmfejydrj .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#avmfejydrj .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#avmfejydrj .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#avmfejydrj .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#avmfejydrj .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#avmfejydrj .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#avmfejydrj .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#avmfejydrj .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#avmfejydrj .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#avmfejydrj .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#avmfejydrj .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#avmfejydrj .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#avmfejydrj .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#avmfejydrj .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#avmfejydrj .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#avmfejydrj .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#avmfejydrj .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#avmfejydrj .gt_left {
text-align: left;
}
&amp;#10;#avmfejydrj .gt_center {
text-align: center;
}
&amp;#10;#avmfejydrj .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#avmfejydrj .gt_font_normal {
font-weight: normal;
}
&amp;#10;#avmfejydrj .gt_font_bold {
font-weight: bold;
}
&amp;#10;#avmfejydrj .gt_font_italic {
font-style: italic;
}
&amp;#10;#avmfejydrj .gt_super {
font-size: 65%;
}
&amp;#10;#avmfejydrj .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#avmfejydrj .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#avmfejydrj .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#avmfejydrj .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#avmfejydrj .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#avmfejydrj .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#avmfejydrj .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#avmfejydrj .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#avmfejydrj div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_heading">
&lt;td colspan="2" class="gt_heading gt_title gt_font_normal" style>Real Values Within 95% Credible Interval&lt;/td>
&lt;/tr>
&lt;tr class="gt_heading">
&lt;td colspan="2" class="gt_heading gt_subtitle gt_font_normal gt_bottom_border" style>Training Data Set&lt;/td>
&lt;/tr>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="forecast_duration_names">Forecast Duration&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="q95">Percentage&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">1 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">96.6%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">2 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">96.6%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">3 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">95.8%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">4 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">95.6%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">5 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">95.6%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">6 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">93.2%&lt;/td>&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>That’s good: at least the range of the posterior predictive distribution is close to our observed data. But it also shouldn’t be too surprising, since the model was trained on this data.&lt;/p>
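&lt;p>The percentages in the table above boil down to a simple calculation: flag each observation that falls inside its 95% interval, then average the flags for each forecast duration. Here’s a minimal sketch of that step. The data frame, its column names, and all of its values are made up for illustration, so the output won’t match the table.&lt;/p>
&lt;pre>&lt;code class="language-r"># Hypothetical data: observed mean errors with the model's 95% bounds
checks &amp;lt;- data.frame(
  forecast_duration = c("1 day", "1 day", "1 day", "6 day", "6 day", "6 day"),
  observed = c(1.0, 1.4, 2.9, 1.8, 2.2, 4.9),
  lower = c(0.6, 0.9, 1.1, 1.2, 1.9, 2.0),
  upper = c(1.8, 2.1, 2.3, 2.6, 3.3, 3.4)
)

# Flag observations that fall inside their interval
checks$inside &amp;lt;- checks$observed >= checks$lower &amp;amp; checks$upper >= checks$observed

# Percentage of observations inside the interval, per forecast duration
coverage &amp;lt;- tapply(checks$inside, checks$forecast_duration, function(x) 100 * mean(x))
coverage
&lt;/code>&lt;/pre>
&lt;p>A well-calibrated model should land close to 95% for each duration. This sketch doesn’t distinguish between misses above and below the interval, which matters here given the one-sided misses seen in the plot.&lt;/p>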
&lt;p>The real test of the model is how it performs on out-of-sample data. I’ve held out ~3 weeks of temperature/forecast accuracy data from the time period immediately after the training set. Each location is still using the jaggedness calculated from the training temperatures. Let’s take a look at how it performs:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-08-15-bom/index_files/figure-html/unnamed-chunk-26-1.png" width="672" />&lt;/p>
&lt;div id="arznasgvzw" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#arznasgvzw table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#arznasgvzw thead, #arznasgvzw tbody, #arznasgvzw tfoot, #arznasgvzw tr, #arznasgvzw td, #arznasgvzw th {
border-style: none;
}
&amp;#10;#arznasgvzw p {
margin: 0;
padding: 0;
}
&amp;#10;#arznasgvzw .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#arznasgvzw .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#arznasgvzw .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#arznasgvzw .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#arznasgvzw .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#arznasgvzw .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#arznasgvzw .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#arznasgvzw .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#arznasgvzw .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#arznasgvzw .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#arznasgvzw .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#arznasgvzw .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#arznasgvzw .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#arznasgvzw .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#arznasgvzw .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#arznasgvzw .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#arznasgvzw .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#arznasgvzw .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#arznasgvzw .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#arznasgvzw .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#arznasgvzw .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#arznasgvzw .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#arznasgvzw .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#arznasgvzw .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#arznasgvzw .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#arznasgvzw .gt_left {
text-align: left;
}
&amp;#10;#arznasgvzw .gt_center {
text-align: center;
}
&amp;#10;#arznasgvzw .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#arznasgvzw .gt_font_normal {
font-weight: normal;
}
&amp;#10;#arznasgvzw .gt_font_bold {
font-weight: bold;
}
&amp;#10;#arznasgvzw .gt_font_italic {
font-style: italic;
}
&amp;#10;#arznasgvzw .gt_super {
font-size: 65%;
}
&amp;#10;#arznasgvzw .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#arznasgvzw .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#arznasgvzw .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#arznasgvzw .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#arznasgvzw .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#arznasgvzw .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#arznasgvzw .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#arznasgvzw .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#arznasgvzw div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_heading">
&lt;td colspan="2" class="gt_heading gt_title gt_font_normal" style>Real Values Within 95% Credible Interval&lt;/td>
&lt;/tr>
&lt;tr class="gt_heading">
&lt;td colspan="2" class="gt_heading gt_subtitle gt_font_normal gt_bottom_border" style>Test (Out-of-Sample) Data Set&lt;/td>
&lt;/tr>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="forecast_duration_names">Forecast Duration&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="q95">Percentage&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">1 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">93.2%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">2 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">93.0%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">3 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">91.8%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">4 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">89.5%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">5 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">89.2%&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="forecast_duration_names" class="gt_row gt_center">6 day&lt;/td>
&lt;td headers="q95" class="gt_row gt_center">85.9%&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>Not bad, but not great. Outside of the three day forecast it starts to fall away.&lt;/p>
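&lt;p>For anyone wondering how a percentage like this can be arrived at, here is a minimal sketch in base R. It is not the code behind the table: the draws and the real values are simulated, and every name in it (&lt;code>draws&lt;/code>, &lt;code>actual&lt;/code>, &lt;code>bounds&lt;/code>, &lt;code>coverage&lt;/code>) is made up for illustration. The idea is simply to take the 2.5% and 97.5% quantiles of the posterior predictive draws for each test observation and count how often the real value lands between them.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">set.seed(1)

n_obs &amp;lt;- 200     # simulated test observations (an assumption)
n_draws &amp;lt;- 1000  # simulated posterior predictive draws per observation

# Simulated stand-ins: one column of draws per observation, plus the real values
draws &amp;lt;- matrix(rnorm(n_draws * n_obs, mean = 0, sd = 1), nrow = n_draws)
actual &amp;lt;- rnorm(n_obs, mean = 0, sd = 1.2)

# Lower and upper bounds of the 95% interval for each observation (2 x n_obs)
bounds &amp;lt;- apply(draws, 2, quantile, probs = c(0.025, 0.975))

# Proportion of real values that fall inside their own interval
coverage &amp;lt;- mean(actual &amp;gt;= bounds[1, ] &amp;amp; actual &amp;lt;= bounds[2, ])
coverage
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>A well-calibrated model would land close to 95% here; the simulated real values are deliberately a little noisier than the draws, so the sketch comes in under that.&lt;/p>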
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>Phew, that one just kept going and going! A recap: we gathered 8 weeks of temperature and temperature forecasts from around Australia, and looked at how well the forecasts performed depending on how far out the forecast was. We then created a summary statistic - jaggedness - and attempted to create a linear model for predicting forecast accuracy based on this, as well as forecast duration. It did OK, but it’s probably a little simplistic, and perhaps not much use in the long term.&lt;/p>
&lt;p>A couple of weeks ago my wife asked me “why are you even doing this?”. Good question. I think there are two reasons:&lt;/p>
&lt;p>The first is that I just wanted to know the answer. How exactly does the BOM perform at forecasting temperature? Now we’ve got some data, and some outputs. It’s a small sample, and we don’t know how it changes over time, but it’s something we can point to. Whether it’s qualitatively good or bad I don’t know; I’ll leave those kinds of judgements to the reader.&lt;/p>
&lt;p>The second reason is that, in doing these kinds of posts, I get to put into practice things I’ve read about in an interesting way and consolidate a lot of knowledge. Here are a few of the things I’ve got a better idea of now:&lt;/p>
&lt;ul>
&lt;li>The &lt;a href="https://readr.tidyverse.org/reference/read_fwf.html">read_fwf()&lt;/a> function in the readr package&lt;/li>
&lt;li>The &lt;a href="https://httr2.r-lib.org/reference/req_perform_parallel.html">req_perform_parallel()&lt;/a> function in the httr2 package for speeding up bulk requests to websites.&lt;/li>
&lt;li>Writing &lt;a href="https://wiki.archlinux.org/title/Systemd/Timers">systemd timers&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://www.dedoimedo.com/computers/linux-timezone.html">Timezone files&lt;/a> in Linux&lt;/li>
&lt;li>Using indexes for categorical variables in Stan models.&lt;/li>
&lt;li>Using a test set in a Stan model.&lt;/li>
&lt;/ul>
&lt;p>To me, those reasons are as good as any.&lt;/p></description></item><item><title>Byte-size: An Ode to Web Scraping with R</title><link>https://clt.blog.foletta.net/post/2025-03-04-bytesize-a-ode-to-rs-scraping/</link><pubDate>Mon, 03 Mar 2025 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2025-03-04-bytesize-a-ode-to-rs-scraping/</guid><description>&lt;p>Last week I needed to pull some data from a website. I was building out a data pipeline to generate the end-of-season awards for Little Athletics, but I ran into a problem. Like a dependable friend, R came to the rescue with a simple, elegant solution. This post is a ‘byte-size’ ode to this dependable friend.&lt;/p>
&lt;h1 id="beauty-and-terseness">Beauty and Terseness&lt;/h1>
&lt;p>I’ll get to the challenge I ran into shortly, but first we’ll take a look at what basic web scraping looks like. Suppose you want to get all the headlines from The Age’s website. You look at the source and see that all the headline &lt;code>&amp;lt;a&amp;gt;&lt;/code> tags have an attribute &lt;em>data-testid&lt;/em> equal to &lt;em>article-link&lt;/em>. Here’s the pipeline that achieves this:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">request&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;http://theage.com.au&amp;#39;&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">req_perform&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">resp_body_html&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">html_elements&lt;/span>(xpath &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;//a[@data-testid=&amp;#39;article-link&amp;#39;]&amp;#34;&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">html_text&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">tibble&lt;/span>(.name_repair &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;headline&amp;#39;&lt;/span>)) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">filter&lt;/span>(headline &lt;span style="color:#f92672">!=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">slice_head&lt;/span>(n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">10&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">gt&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="zpyqsswiby" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#zpyqsswiby table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#zpyqsswiby thead, #zpyqsswiby tbody, #zpyqsswiby tfoot, #zpyqsswiby tr, #zpyqsswiby td, #zpyqsswiby th {
border-style: none;
}
&amp;#10;#zpyqsswiby p {
margin: 0;
padding: 0;
}
&amp;#10;#zpyqsswiby .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#zpyqsswiby .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#zpyqsswiby .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#zpyqsswiby .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#zpyqsswiby .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#zpyqsswiby .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#zpyqsswiby .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#zpyqsswiby .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#zpyqsswiby .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#zpyqsswiby .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#zpyqsswiby .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#zpyqsswiby .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#zpyqsswiby .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#zpyqsswiby .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#zpyqsswiby .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zpyqsswiby .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#zpyqsswiby .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#zpyqsswiby .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#zpyqsswiby .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zpyqsswiby .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#zpyqsswiby .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zpyqsswiby .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#zpyqsswiby .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zpyqsswiby .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#zpyqsswiby .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#zpyqsswiby .gt_left {
text-align: left;
}
&amp;#10;#zpyqsswiby .gt_center {
text-align: center;
}
&amp;#10;#zpyqsswiby .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#zpyqsswiby .gt_font_normal {
font-weight: normal;
}
&amp;#10;#zpyqsswiby .gt_font_bold {
font-weight: bold;
}
&amp;#10;#zpyqsswiby .gt_font_italic {
font-style: italic;
}
&amp;#10;#zpyqsswiby .gt_super {
font-size: 65%;
}
&amp;#10;#zpyqsswiby .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#zpyqsswiby .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#zpyqsswiby .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#zpyqsswiby .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#zpyqsswiby .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#zpyqsswiby .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#zpyqsswiby .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#zpyqsswiby .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#zpyqsswiby div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="headline">headline&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">Target Time&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">Get 2-for-1 Comedy Festival tickets*&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">The Morning Edition podcast&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">The 100 most expensive Melbourne public schools revealed&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">$4b on a new station in Melbourne’s west – has Victoria lost its budgetary mind?&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">Daily atrocities spew out of the Oval Office. It’s still not enough to lure Australian voters back to the known&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">Trump administration upends US foreign policy, holds secret talks with Hamas&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">Trump’s speech was full of wild claims. Here are seven that weren’t true&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">US lands new blow on Ukraine after Oval Office stoush&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="headline" class="gt_row gt_left">The number one reason why people are leaving Melbourne&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>There we go, five lines of R and you’ve got the headlines, plus a couple more to get it into a nicer structure. What makes it so simple? I think it comes down to two things. The first is R’s pipe operator, which means you don’t have to pepper your code with temporary variables. The second is R’s vectorisation, which means you don’t need to write any loops. I also tip my hat to the relatively new &lt;a href="https://httr2.r-lib.org/">httr2&lt;/a> package, which makes web requests fit much better into a pipeline.&lt;/p>
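&lt;p>To make those two points concrete, here is a small self-contained sketch in base R with no network access. The strings and the &lt;code>drop_empty()&lt;/code> helper are made up for illustration and aren’t part of the pipeline above; the native pipe needs R 4.1 or later.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Made-up input standing in for scraped link text (an assumption, not real data)
raw &amp;lt;- c(&amp;#34;  Target Time &amp;#34;, &amp;#34;&amp;#34;, &amp;#34;The Morning Edition podcast  &amp;#34;)

# Hypothetical helper: keep only the non-empty strings
drop_empty &amp;lt;- function(x) x[nzchar(x)]

# Without the pipe, every step needs a temporary variable
trimmed &amp;lt;- trimws(raw)
kept &amp;lt;- drop_empty(trimmed)
lengths_tmp &amp;lt;- nchar(kept)

# With the pipe, each result flows straight into the next call
lengths_piped &amp;lt;- raw |&amp;gt; trimws() |&amp;gt; drop_empty() |&amp;gt; nchar()

# Vectorisation: trimws(), nzchar() and nchar() each work on the whole
# character vector at once, so no explicit loop is written anywhere
identical(lengths_tmp, lengths_piped)
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Both versions do the same work; the piped one just reads top to bottom without the intermediate names.&lt;/p>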
&lt;h1 id="the-challenge">The Challenge&lt;/h1>
&lt;p>The challenge I ran into last week was that, while the data I needed was structured, it wasn’t in HTML, XML, or even JSON; it was actually JavaScript. Here’s an abridged sample of what was returned by one of the API calls:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-json" data-lang="json">&lt;span style="color:#960050;background-color:#1e0010">sessions_NMRKeilor&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">=&lt;/span> [
{&lt;span style="color:#f92672">&amp;#34;SessNbr&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;1&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessPtr&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;24&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessName&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;Sat Morning - Field&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessDay&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;1&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessTime&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;30600&amp;#34;&lt;/span>},
{&lt;span style="color:#f92672">&amp;#34;SessNbr&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;2&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessPtr&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;25&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessName&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;Sat Morning - Track&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessDay&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;1&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessTime&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;32400&amp;#34;&lt;/span>},
{&lt;span style="color:#f92672">&amp;#34;SessNbr&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;3&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessPtr&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;46&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessName&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;Sat Afternoon - Field&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessDay&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;1&amp;#34;&lt;/span>,&lt;span style="color:#f92672">&amp;#34;SessTime&amp;#34;&lt;/span>:&lt;span style="color:#e6db74">&amp;#34;46800&amp;#34;&lt;/span>},
&lt;span style="color:#960050;background-color:#1e0010">...&lt;/span>
]
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Not sure what to do, I fetch the data:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">js_content &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">request&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;https://lavic.resultshub.com.au/php/resultsFileFetch.php?season=2024&amp;amp;series=regions&amp;amp;round=3&amp;amp;venue=undefined&amp;#39;&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">req_perform&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">resp_body_string&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The mind initially goes to dark places: can I solve this with a regex? Maybe filter out the variable assignment portions and parse the rest as JSON? Pulling myself together, I think “this is just JavaScript, is there a way I can simply evaluate it?”. Some research shows that there’s an R package, V8, that provides an interface to Google’s V8 JavaScript engine. This should allow me to evaluate the JavaScript code we received:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">jscontext &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">v8&lt;/span>()
jscontext&lt;span style="color:#f92672">$&lt;/span>&lt;span style="color:#a6e22e">eval&lt;/span>(js_content)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>From there I can get the variable I need, and the library converts this from JSON to a clean data frame:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">jscontext&lt;span style="color:#f92672">$&lt;/span>&lt;span style="color:#a6e22e">get&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;sessions_NMRKeilor&amp;#39;&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">gt&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="siumjblpqx" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#siumjblpqx table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#siumjblpqx thead, #siumjblpqx tbody, #siumjblpqx tfoot, #siumjblpqx tr, #siumjblpqx td, #siumjblpqx th {
border-style: none;
}
&amp;#10;#siumjblpqx p {
margin: 0;
padding: 0;
}
&amp;#10;#siumjblpqx .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#siumjblpqx .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#siumjblpqx .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#siumjblpqx .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#siumjblpqx .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#siumjblpqx .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#siumjblpqx .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#siumjblpqx .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#siumjblpqx .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#siumjblpqx .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#siumjblpqx .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#siumjblpqx .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#siumjblpqx .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#siumjblpqx .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#siumjblpqx .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#siumjblpqx .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#siumjblpqx .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#siumjblpqx .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#siumjblpqx .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#siumjblpqx .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#siumjblpqx .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#siumjblpqx .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#siumjblpqx .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#siumjblpqx .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#siumjblpqx .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#siumjblpqx .gt_left {
text-align: left;
}
&amp;#10;#siumjblpqx .gt_center {
text-align: center;
}
&amp;#10;#siumjblpqx .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#siumjblpqx .gt_font_normal {
font-weight: normal;
}
&amp;#10;#siumjblpqx .gt_font_bold {
font-weight: bold;
}
&amp;#10;#siumjblpqx .gt_font_italic {
font-style: italic;
}
&amp;#10;#siumjblpqx .gt_super {
font-size: 65%;
}
&amp;#10;#siumjblpqx .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#siumjblpqx .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#siumjblpqx .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#siumjblpqx .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#siumjblpqx .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#siumjblpqx .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#siumjblpqx .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#siumjblpqx .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#siumjblpqx div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="SessNbr">SessNbr&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="SessPtr">SessPtr&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="SessName">SessName&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="SessDay">SessDay&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="SessTime">SessTime&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="SessNbr" class="gt_row gt_right">1&lt;/td>
&lt;td headers="SessPtr" class="gt_row gt_right">24&lt;/td>
&lt;td headers="SessName" class="gt_row gt_left">Sat Morning - Field&lt;/td>
&lt;td headers="SessDay" class="gt_row gt_right">1&lt;/td>
&lt;td headers="SessTime" class="gt_row gt_right">30600&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="SessNbr" class="gt_row gt_right">2&lt;/td>
&lt;td headers="SessPtr" class="gt_row gt_right">25&lt;/td>
&lt;td headers="SessName" class="gt_row gt_left">Sat Morning - Track&lt;/td>
&lt;td headers="SessDay" class="gt_row gt_right">1&lt;/td>
&lt;td headers="SessTime" class="gt_row gt_right">32400&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="SessNbr" class="gt_row gt_right">3&lt;/td>
&lt;td headers="SessPtr" class="gt_row gt_right">46&lt;/td>
&lt;td headers="SessName" class="gt_row gt_left">Sat Afternoon - Field&lt;/td>
&lt;td headers="SessDay" class="gt_row gt_right">1&lt;/td>
&lt;td headers="SessTime" class="gt_row gt_right">46800&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="SessNbr" class="gt_row gt_right">4&lt;/td>
&lt;td headers="SessPtr" class="gt_row gt_right">30&lt;/td>
&lt;td headers="SessName" class="gt_row gt_left">Sat Afternoon - Track&lt;/td>
&lt;td headers="SessDay" class="gt_row gt_right">1&lt;/td>
&lt;td headers="SessTime" class="gt_row gt_right">46800&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="SessNbr" class="gt_row gt_right">5&lt;/td>
&lt;td headers="SessPtr" class="gt_row gt_right">43&lt;/td>
&lt;td headers="SessName" class="gt_row gt_left">Sun Morning - Field&lt;/td>
&lt;td headers="SessDay" class="gt_row gt_right">2&lt;/td>
&lt;td headers="SessTime" class="gt_row gt_right">30600&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="SessNbr" class="gt_row gt_right">6&lt;/td>
&lt;td headers="SessPtr" class="gt_row gt_right">38&lt;/td>
&lt;td headers="SessName" class="gt_row gt_left">Sun Morning - Track&lt;/td>
&lt;td headers="SessDay" class="gt_row gt_right">2&lt;/td>
&lt;td headers="SessTime" class="gt_row gt_right">30600&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="SessNbr" class="gt_row gt_right">7&lt;/td>
&lt;td headers="SessPtr" class="gt_row gt_right">47&lt;/td>
&lt;td headers="SessName" class="gt_row gt_left">Sun Afternoon - Field&lt;/td>
&lt;td headers="SessDay" class="gt_row gt_right">2&lt;/td>
&lt;td headers="SessTime" class="gt_row gt_right">46800&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="SessNbr" class="gt_row gt_right">8&lt;/td>
&lt;td headers="SessPtr" class="gt_row gt_right">45&lt;/td>
&lt;td headers="SessName" class="gt_row gt_left">Sun Afternoon - Track&lt;/td>
&lt;td headers="SessDay" class="gt_row gt_right">2&lt;/td>
&lt;td headers="SessTime" class="gt_row gt_right">47700&lt;/td>&lt;/tr>
&lt;/tbody>
&amp;#10;
&lt;/table>
&lt;/div>
&lt;p>That’s it: web request to JavaScript evaluation to structured R data in a few lines; a thing of beauty.&lt;/p></description></item><item><title>A Day in the Life: The Global BGP Table</title><link>https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/</link><pubDate>Tue, 12 Nov 2024 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/</guid><description>&lt;p>&lt;strong>Update:&lt;/strong> &lt;a href="https://news.ycombinator.com/item?id=42233565">This article was discussed on Hackernews&lt;/a>&lt;/p>
&lt;p>Much has been written and a lot of analysis performed on the global BGP table over the years, a significant portion by the inimitable &lt;a href="https://bgp.potaroo.net/">Geoff Huston&lt;/a>. However, this work often focuses on long-term trends, like the growth of the routing table or the adoption of IPv6, dealing with time frames of months or years.&lt;/p>
&lt;p>I was more interested in what was happening in the short term: what does it look like on the front line for those poor routers connected to the churning, foamy chaos of the internet, trying their best to adhere to &lt;a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel’s Law&lt;/a>? What we’ll look at in this article is “a day in the life of the global BGP table”, exploring the intra-day shenanigans with an eye to finding some of the ridiculous things that go on out there.&lt;/p>
&lt;p>We’ll focus in on three key areas:&lt;/p>
&lt;ul>
&lt;li>General behaviour over the course of the day&lt;/li>
&lt;li>Outlier path attributes&lt;/li>
&lt;li>Flappy paths&lt;/li>
&lt;/ul>
&lt;p>As you’ll see, we end up with more questions than answers, but I think that’s the hallmark of good exploratory work. Let’s dive in.&lt;/p>
&lt;h1 id="let-the-yak-shaving-begin">Let the Yak Shaving Begin&lt;/h1>
&lt;p>The first step, as always, is to get some data to work with. Parsing the debug outputs from various routers seemed like a recipe for disaster, so instead I did a little yak-shaving. I went back to a half-finished BGP daemon project I’d started writing years ago and got it into a working state. The result is &lt;strong>&lt;a href="https://github.com/gregfoletta/bgpsee">bgpsee&lt;/a>&lt;/strong>, a multi-threaded BGP peering tool for the CLI. Once peered with another router, all the BGP messages - OPENs, KEEPALIVEs, and most importantly UPDATEs - are parsed and output as JSON.&lt;/p>
&lt;p>For example, here’s one of the BGP updates from the dataset we’re working with in this article:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-json" data-lang="json">{
&lt;span style="color:#f92672">&amp;#34;recv_time&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">1704483075&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;id&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">12349&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;UPDATE&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;nlri&amp;#34;&lt;/span>: [ &lt;span style="color:#e6db74">&amp;#34;38.43.124.0/23&amp;#34;&lt;/span> ],
&lt;span style="color:#f92672">&amp;#34;withdrawn_routes&amp;#34;&lt;/span>: [],
&lt;span style="color:#f92672">&amp;#34;path_attributes&amp;#34;&lt;/span>: [
{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;ORIGIN&amp;#34;&lt;/span>, &lt;span style="color:#f92672">&amp;#34;type_code&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">1&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;origin&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;IGP&amp;#34;&lt;/span>
},
{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;AS_PATH&amp;#34;&lt;/span>, &lt;span style="color:#f92672">&amp;#34;type_code&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">2&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;n_as_segments&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">1&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;path_segments&amp;#34;&lt;/span>: [
{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;AS_SEQUENCE&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;n_as&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">6&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;asns&amp;#34;&lt;/span>: [ &lt;span style="color:#ae81ff">45270&lt;/span>, &lt;span style="color:#ae81ff">4764&lt;/span>, &lt;span style="color:#ae81ff">2914&lt;/span>, &lt;span style="color:#ae81ff">12956&lt;/span>, &lt;span style="color:#ae81ff">27951&lt;/span>, &lt;span style="color:#ae81ff">23456&lt;/span> ]
}
]
},
{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;NEXT_HOP&amp;#34;&lt;/span>, &lt;span style="color:#f92672">&amp;#34;type_code&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">3&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;next_hop&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;61.245.147.114&amp;#34;&lt;/span>
},
{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;AS4_PATH&amp;#34;&lt;/span>, &lt;span style="color:#f92672">&amp;#34;type_code&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">17&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;n_as_segments&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">1&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;path_segments&amp;#34;&lt;/span>: [
{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;AS_SEQUENCE&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;n_as&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">6&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;asns&amp;#34;&lt;/span>: [ &lt;span style="color:#ae81ff">45270&lt;/span>,&lt;span style="color:#ae81ff">4764&lt;/span>, &lt;span style="color:#ae81ff">2914&lt;/span>, &lt;span style="color:#ae81ff">12956&lt;/span>, &lt;span style="color:#ae81ff">27951&lt;/span>, &lt;span style="color:#ae81ff">273013&lt;/span> ]
}
]
}
]
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Collected between 6 and 7 January 2024, the full dataset consists of 464,673 BGP UPDATE messages received from a peer (many thanks to &lt;a href="https://www.linkedin.com/in/andrew-vinton/">Andrew Vinton&lt;/a>) with a full BGP table. Let’s take a look at how this full table behaves over the course of the day.&lt;/p>
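&lt;p>To give a feel for working with this output, here’s a minimal Python sketch that pulls the routes and the AS path out of one message. It assumes each message is a JSON object shaped like the example above; the &lt;code>as_path()&lt;/code> helper and the trimmed sample are my own illustration, not part of bgpsee.&lt;/p>

```python
import json

# A trimmed copy of the UPDATE shown above (NEXT_HOP and AS4_PATH omitted for brevity).
SAMPLE_UPDATE = """
{
  "recv_time": 1704483075,
  "id": 12349,
  "type": "UPDATE",
  "nlri": ["38.43.124.0/23"],
  "withdrawn_routes": [],
  "path_attributes": [
    {"type": "ORIGIN", "type_code": 1, "origin": "IGP"},
    {"type": "AS_PATH", "type_code": 2, "n_as_segments": 1,
     "path_segments": [
       {"type": "AS_SEQUENCE", "n_as": 6,
        "asns": [45270, 4764, 2914, 12956, 27951, 23456]}
     ]}
  ]
}
"""


def as_path(update, attribute="AS_PATH"):
    """Return the ASNs of the named path attribute, flattened across its segments.

    Hypothetical helper: returns an empty list if the attribute is absent.
    """
    asns = []
    for attr in update.get("path_attributes", []):
        if attr.get("type") == attribute:
            for segment in attr.get("path_segments", []):
                asns.extend(segment.get("asns", []))
    return asns


update = json.loads(SAMPLE_UPDATE)
print(update["type"], update["nlri"], as_path(update))
```

&lt;p>Passing &lt;code>"AS4_PATH"&lt;/code> as the second argument would read the four-byte AS path instead, where a message carries one.&lt;/p>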
&lt;h1 id="initial-send-number-of-v4-and-v6-paths">Initial Send, Number of v4 and v6 Paths&lt;/h1>
&lt;p>When you first bring up a BGP peering with a router you get a big dump of UPDATEs, what I’ll call the ‘first tranche’. It consists of all paths and associated network layer reachability information (NLRI, or more simply ‘routes’) in the router’s BGP table. After this first tranche the peering only receives UPDATEs for paths that have changed, or withdrawals for routes which no longer have any paths. There’s no structural difference between the first tranche and the subsequent UPDATEs, except that you receive the first batch in the first 5 or so seconds of the peering coming up.&lt;/p>
&lt;p>Here’s a breakdown of the number of distinct paths received in that first tranche, separated by IP version:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/index_files/figure-html/unnamed-chunk-7-1.png" width="672" />
It’s important to highlight that this is a count of BGP paths, &lt;strong>not&lt;/strong> routes. Each path is a unique combination of path attributes with associated NLRI attached, sent in a distinct BGP UPDATE message. There could be one or one thousand routes associated with each path. In this first tranche the total number of routes across all of these paths is 949,483.&lt;/p>
&lt;h1 id="a-garden-hose-or-a-fire-hose">A Garden Hose or a Fire Hose?&lt;/h1>
&lt;p>That’s all we’ll look at in the first tranche; from this point on we’ll focus our attention on the rest of the updates received across the day. The updates aren’t sent as a real-time stream, but in bunches based on the &lt;a href="https://datatracker.ietf.org/doc/html/rfc4271#section-10">Route Advertisement Interval&lt;/a> timer, which for this peering was 30 seconds. Here’s a time-series view of the number of updates received during the course of the day:&lt;/p>
&lt;p>&lt;img src="index_files/figure-html/unnamed-chunk-8-1.gif" alt="">&lt;!-- -->
For IPv4 paths you’re looking on average at around 50 path updates every 30 seconds. For IPv6 it’s slightly lower, at around 47 path updates. While the averages are close, the spread is quite different: the standard deviations are 64.3 and 43 for v4 and v6 respectively.&lt;/p>
&lt;p>Instead of looking at the total count of updates, we can look at the total aggregate IP address change. We do this by adding up the number of IP addresses across all updates in every 30 second interval, then taking the log2() of the sum. So for example: a /22, a /23 and a /24 would be \(\log_2(2^{32-22} + 2^{32-23} + 2^{32-24})\)&lt;/p>
&lt;p>Below is the log2() IPv4 address space, viewed as a time series and as a density plot. It shows that on average, every 30 seconds, around \(2^{16}\) IP addresses (i.e. a /16) change paths in the global routing table, and that 95% of the time the change in IP address space is between \(2^{20.75}\) (approx. a /11) and \(2^{13.85}\) (approx. a /18).&lt;/p>
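&lt;p>As a quick worked version of that arithmetic, here’s a small Python sketch. The function name is my own, and it assumes the prefixes are simply summed as described, with no attempt to de-duplicate overlapping prefixes.&lt;/p>

```python
import math


def log2_address_space(prefix_lengths, address_bits=32):
    """log2 of the total number of addresses covered by the given prefix lengths."""
    total = sum(2 ** (address_bits - length) for length in prefix_lengths)
    return math.log2(total)


# The example from the text: a /22, a /23 and a /24.
# That is 1024 + 512 + 256 = 1792 addresses.
value = log2_address_space([22, 23, 24])
print(round(value, 3))

# A single /16 sits at exactly 16 on this scale.
print(log2_address_space([16]))
```

&lt;p>The three prefixes together come out a little under 11, i.e. a bit less address space than a single /21.&lt;/p>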
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/index_files/figure-html/unnamed-chunk-9-1.png" width="672" />&lt;/p>
&lt;p>What is apparent in both the path and IP space changes over time is that there is some sort of cyclic behaviour in the IPv4 updates. To determine the period of this cycle we can use an &lt;a href="https://otexts.com/fpp3/acf.html">ACF&lt;/a>, or autocorrelation, plot. We calculate the correlation between the number of paths received at time \(t\), written \(y_t\), and the number received at each of the lagged times \(y_{t-1}, y_{t-2}, \ldots, y_{t-n}\). I’ve grouped the updates together into 1 minute intervals, so 1 lag = 1 minute.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/index_files/figure-html/unnamed-chunk-10-1.png" width="672" />
There is a strong correlation in the first 7 or so lags, which intuitively makes sense to me as path changes can create other path changes as they propagate around the world. But there also appears to be strong correlation at lags 40 and 41, indicating some cyclic behaviour every forty minutes. This gives us the first question which I’ll leave unanswered:&lt;/p>
&lt;ul>
&lt;li>&lt;em>What is causing the global IPv4 BGP table to have a 40 minute cycle?&lt;/em>&lt;/li>
&lt;/ul>
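&lt;p>For anyone wanting to see what the autocorrelation is actually computing, here’s a minimal Python sketch of the standard sample ACF. It is not the code that produced the plot above, and the series it runs on is synthetic: a made-up count that spikes every 40 minutes, purely to show how a cycle surfaces at the matching lag.&lt;/p>

```python
def acf(series, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [y - mean for y in series]
    variance = sum(d * d for d in deviations)
    result = []
    for lag in range(max_lag + 1):
        # Sum of products of each deviation with the one `lag` steps earlier.
        covariance = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
        result.append(covariance / variance)
    return result


# Synthetic per-minute update counts (NOT the real dataset):
# a baseline of 50 with a spike to 100 every 40 minutes.
counts = [100 if minute % 40 == 0 else 50 for minute in range(400)]

r = acf(counts, max_lag=45)
print(round(r[20], 3), round(r[40], 3))
```

&lt;p>On this toy series the value at lag 40 is far larger than at lag 20, which is the same kind of signature as the bump at lags 40 and 41 in the real data.&lt;/p>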
&lt;h1 id="prepending-madness">Prepending Madness&lt;/h1>
&lt;p>If you’re a network admin, there are a couple of different ways you can influence how traffic enters your ASN. You can use longer network prefixes, but this doesn’t scale well and you’re not being a polite BGP citizen. You can use the MED attribute, but it’s non-transitive so it doesn’t work if you’re peered to multiple ASes. The usual go-to is to modify the AS path length by prepending your own AS one or more times in announcements to certain peers, making that path less preferable.&lt;/p>
&lt;p>In the chaos of the global routing table, some people take this prepending too far. This has in the past caused &lt;a href="https://blog.ipspace.net/2009/02/root-cause-analysis-oversized-as-paths/">large, global problems&lt;/a>. Let’s take a look at the top 50 AS path lengths for IPv4 and IPv6 updates respectively:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/index_files/figure-html/unnamed-chunk-11-1.png" width="672" />
What stands out is the difference between IPv4 and IPv6. The largest IPv4 path length is 105, which is still pretty ridiculous given the fact that the largest non-prepended path in this dataset has a length of 14. But compared to the IPv6 paths it’s outright sensible: top of the table for IPv6 comes in at a whopping 599 ASes! An AS path is actually made up of one or more &lt;a href="https://datatracker.ietf.org/doc/html/rfc4271#section-5.1.2">AS sets or AS sequences&lt;/a>, each of which has a maximum length of 255. So it’s taken three AS sequences to announce those routes.&lt;/p>
&lt;p>Here’s the longest IPv4 path in all its glory with its 105 ASNs. It originated from AS149381 “Dinas Komunikasi dan Informatika Kabupaten Tulungagung” in Indonesia.&lt;/p>
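&lt;p>The “three AS sequences” figure is just a ceiling division, which this small Python sketch spells out. The helper names are mine, the 599-entry path is a stand-in list rather than the real one, and the exact split a router chooses is an assumption; only the 255 limit comes from the segment length being a single octet.&lt;/p>

```python
import math

# A path segment's length field is one octet, so a segment holds at most 255 ASNs.
MAX_SEGMENT_LENGTH = 255


def segments_needed(path_length):
    """Minimum number of path segments needed to carry path_length ASNs."""
    return math.ceil(path_length / MAX_SEGMENT_LENGTH)


def split_into_segments(asns):
    """Split a flat list of ASNs into chunks no longer than the segment limit."""
    return [
        asns[start:start + MAX_SEGMENT_LENGTH]
        for start in range(0, len(asns), MAX_SEGMENT_LENGTH)
    ]


# Stand-in for the 599-entry IPv6 path (illustrative values only).
stand_in_path = [8772] * 599

print(segments_needed(105), segments_needed(599))
print([len(segment) for segment in split_into_segments(stand_in_path)])
```

&lt;p>So the 105-entry IPv4 path fits in a single segment, while 599 entries need at least three.&lt;/p>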
&lt;pre>&lt;code>[1] &amp;quot;45270 4764 9002 136106 45305 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381 149381&amp;quot;
&lt;/code>&lt;/pre>
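&lt;p>To separate the real hops from the prepends in a path like this, you can collapse consecutive repeats. Here’s a rough Python sketch; the path string is rebuilt from the output above on the assumption that it is five upstream ASNs followed by AS149381 repeated to make up the stated length of 105, and &lt;code>collapse_prepends()&lt;/code> is a name I’ve made up for the illustration.&lt;/p>

```python
from itertools import groupby

# Rebuilt from the path printed above (assumption: 5 upstream ASNs + 100 copies
# of the origin AS, giving the stated total of 105).
path_text = "45270 4764 9002 136106 45305" + " 149381" * 100
asns = [int(asn) for asn in path_text.split()]


def collapse_prepends(asns):
    """Collapse consecutive repeats into (asn, repeat_count) pairs."""
    return [(asn, len(list(group))) for asn, group in groupby(asns)]


hops = collapse_prepends(asns)
print(len(asns), len(hops), hops[-1])
```

&lt;p>Under that assumption the 105 entries collapse to six distinct hops, with everything else being the origin AS prepending itself.&lt;/p>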
&lt;p>We see that around 6 hours and 50 minutes later they realise the error of their ways and announce a path with only four ASes, rather than 105:&lt;/p>
&lt;div id="xqhmswhyyp" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#xqhmswhyyp table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#xqhmswhyyp thead, #xqhmswhyyp tbody, #xqhmswhyyp tfoot, #xqhmswhyyp tr, #xqhmswhyyp td, #xqhmswhyyp th {
border-style: none;
}
&amp;#10;#xqhmswhyyp p {
margin: 0;
padding: 0;
}
&amp;#10;#xqhmswhyyp .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#xqhmswhyyp .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#xqhmswhyyp .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#xqhmswhyyp .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#xqhmswhyyp .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#xqhmswhyyp .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#xqhmswhyyp .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#xqhmswhyyp .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#xqhmswhyyp .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#xqhmswhyyp .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#xqhmswhyyp .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#xqhmswhyyp .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#xqhmswhyyp .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#xqhmswhyyp .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#xqhmswhyyp .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#xqhmswhyyp .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#xqhmswhyyp .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#xqhmswhyyp .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#xqhmswhyyp .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#xqhmswhyyp .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#xqhmswhyyp .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#xqhmswhyyp .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#xqhmswhyyp .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#xqhmswhyyp .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#xqhmswhyyp .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#xqhmswhyyp .gt_left {
text-align: left;
}
&amp;#10;#xqhmswhyyp .gt_center {
text-align: center;
}
&amp;#10;#xqhmswhyyp .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#xqhmswhyyp .gt_font_normal {
font-weight: normal;
}
&amp;#10;#xqhmswhyyp .gt_font_bold {
font-weight: bold;
}
&amp;#10;#xqhmswhyyp .gt_font_italic {
font-style: italic;
}
&amp;#10;#xqhmswhyyp .gt_super {
font-size: 65%;
}
&amp;#10;#xqhmswhyyp .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#xqhmswhyyp .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#xqhmswhyyp .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#xqhmswhyyp .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#xqhmswhyyp .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#xqhmswhyyp .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#xqhmswhyyp .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#xqhmswhyyp .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#xqhmswhyyp div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="recv_time">recv_time&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="time_difference">time_difference&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="id">id&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="as_path_length">as_path_length&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="type">type&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="nlri">nlri&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="recv_time" class="gt_row gt_right">2024-01-06 06:31:18&lt;/td>
&lt;td headers="time_difference" class="gt_row gt_center">NA&lt;/td>
&lt;td headers="id" class="gt_row gt_right">66121&lt;/td>
&lt;td headers="as_path_length" class="gt_row gt_right">105&lt;/td>
&lt;td headers="type" class="gt_row gt_left">UPDATE&lt;/td>
&lt;td headers="nlri" class="gt_row gt_right">103.179.250.0/24&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="recv_time" class="gt_row gt_right">2024-01-06 13:21:35&lt;/td>
&lt;td headers="time_difference" class="gt_row gt_center">6.84&lt;/td>
&lt;td headers="id" class="gt_row gt_right">280028&lt;/td>
&lt;td headers="as_path_length" class="gt_row gt_right">4&lt;/td>
&lt;td headers="type" class="gt_row gt_left">UPDATE&lt;/td>
&lt;td headers="nlri" class="gt_row gt_right">103.179.250.0/24&lt;/td>&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>Here’s the largest IPv6 path, with its mammoth 599 ASNs; I’ll let you enjoy scrolling to the right on this one:&lt;/p>
&lt;pre>&lt;code>[1] &amp;quot;45270 4764 2914 29632 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 
8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 8772 200579 200579 203868&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>Interestingly it’s not the originator that’s prepending, but AS8772 ‘NetAssist LLC’, an ISP out of Ukraine, prepending to make paths to AS203868 (Rifqi Arief Pamungkas, again out of Indonesia) less preferable.&lt;/p>
&lt;p>Why is there such a difference between the largest IPv4 and IPv6 path lengths? I had a couple of different theories, but then looked at the total number of ASNs in &lt;em>all&lt;/em> positions for those top 50 longest paths, and it became apparent what was happening:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/index_files/figure-html/unnamed-chunk-15-1.png" width="672" />
Looks like they let the junior network admin at NetAssist on to the tools too early!&lt;/p>
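&lt;p>As a rough sketch of how the prepending shows up in the data, the snippet below counts how often each ASN appears in a single AS path. The function name &lt;em>count_asns&lt;/em> and the shortened path are made up for illustration; it is not the code used to produce the chart above.&lt;/p>

```shell
#!/bin/sh
# Count how many times each ASN appears in a space-separated AS path.
# Prints "count asn" lines, most frequent ASN first.
# count_asns is a hypothetical helper, not part of the original analysis.
count_asns() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# A shortened, made-up path with the same shape as the long IPv6 path
count_asns "45270 4764 2914 29632 8772 8772 8772 8772 200579 200579 203868"
```

&lt;p>An ASN that appears far more often than its neighbours is the one doing the prepending.&lt;/p>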
&lt;h1 id="path-attributes">Path Attributes&lt;/h1>
&lt;p>Each BGP update consists of network layer reachability information (routes) and path attributes, for example AS_PATH, NEXT_HOP, etc. There are four kinds of attributes:&lt;/p>
&lt;ol>
&lt;li>Well-known mandatory&lt;/li>
&lt;li>Well-known discretionary&lt;/li>
&lt;li>Optional transitive&lt;/li>
&lt;li>Optional non-transitive&lt;/li>
&lt;/ol>
&lt;p>&lt;a href="https://datatracker.ietf.org/doc/html/rfc4271#section-5">Section 5&lt;/a> of RFC4271 has a good description of all of these.&lt;/p>
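&lt;p>On the wire, part of this classification is carried in the attribute flags octet described in section 4.3 of the same RFC: the high-order bit marks an attribute as optional, and the next bit marks it as transitive. The sketch below decodes those two bits. The function name is hypothetical, and the mandatory versus discretionary split is not encoded in the flags at all; it comes from the definition of each attribute.&lt;/p>

```shell
#!/bin/sh
# Classify a BGP path attribute from its flags octet (RFC 4271, section 4.3).
# classify_attr_flags is a hypothetical helper written for illustration.
classify_attr_flags() {
    flags=$(( $1 ))
    optional=$(( (flags >> 7) % 2 ))     # high-order bit: 1 = optional, 0 = well-known
    transitive=$(( (flags >> 6) % 2 ))   # next bit: 1 = transitive
    if [ "$optional" -eq 0 ]; then
        echo "well-known"
    elif [ "$transitive" -eq 1 ]; then
        echo "optional transitive"
    else
        echo "optional non-transitive"
    fi
}

classify_attr_flags 0x40   # flags normally seen on AS_PATH
classify_attr_flags 0xC0   # flags normally seen on AGGREGATOR
classify_attr_flags 0x80   # flags normally seen on MULTI_EXIT_DISC
```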
&lt;p>What we can do is take a look at the number of attributes we’ve seen across all of our IPv4 paths, placing this on a log scale to make it easier to view:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/index_files/figure-html/unnamed-chunk-17-1.png" width="672" />&lt;/p>
&lt;p>The well-known mandatory attributes, ORIGIN, NEXT_HOP, and AS_PATH, are present in all updates, and have the same counts. There’s a few other common attributes (e.g. AGGREGATOR), and some less common ones (AS_PATHLIMIT and ATTR_SET). However some ASes have attached attribute 255 - the &lt;a href="https://www.rfc-editor.org/rfc/rfc2042.html">reserved for development&lt;/a> attribute - to their updates.&lt;/p>
&lt;p>At the time of receiving the updates my bgpsee daemon didn’t save the values of these esoteric path attributes. But using &lt;a href="https://routeviews.org">routeviews.org&lt;/a> we can see that some ASes are still announcing paths with this attribute, and we can observe the raw bytes of its value:&lt;/p>
&lt;pre>&lt;code>- AS265999 attrib. 255 value: 0000 07DB 0000 0001 0001 000A FF08 0000 0000 0C49 75B3
- AS10429 attrib. 255 value: 0000 07DB 0000 0001 0001 000A FF08 0000 0003 43DC 75C3
- AS52564 attrib. 255 value: 0000 07DB 0000 0001 0001 0012 FF10 0000 0000 0C49 75B3 0000 0000 4003 F1C9
&lt;/code>&lt;/pre>
&lt;p>Three different ISPs, all announcing paths with this strange path attribute, and raw bytes of the attribute having a similar structure.&lt;/p>
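&lt;p>One way to make the "similar structure" concrete is to treat the bytes as nested length-prefixed fields. This is only a guess at the layout, not a documented format: it assumes bytes 10 and 11 hold a length, followed by a one-byte type and a one-byte length. The sketch below (with a hypothetical &lt;em>check_lengths&lt;/em> helper) prints those assumed fields next to the number of bytes that actually follow them.&lt;/p>

```shell
#!/bin/sh
# Compare assumed length fields in an attribute 255 value with the bytes that follow.
# The field offsets are an assumption made for illustration, not a known format.
check_lengths() {
    hex=$(printf '%s' "$1" | tr -d ' ')
    total=$(( ${#hex} / 2 ))                                  # total bytes in the value
    outer_len=$(( 0x$(printf '%s' "$hex" | cut -c21-24) ))    # bytes 10-11, read as a length
    inner_type=$(( 0x$(printf '%s' "$hex" | cut -c25-26) ))   # byte 12, read as a type
    inner_len=$(( 0x$(printf '%s' "$hex" | cut -c27-28) ))    # byte 13, read as a length
    echo "total=$total outer_len=$outer_len after_outer=$(( total - 12 )) inner_type=$inner_type inner_len=$inner_len after_inner=$(( total - 14 ))"
}

check_lengths "0000 07DB 0000 0001 0001 000A FF08 0000 0000 0C49 75B3"
check_lengths "0000 07DB 0000 0001 0001 0012 FF10 0000 0000 0C49 75B3 0000 0000 4003 F1C9"
```

&lt;p>For the values shown above, both assumed length fields match the number of bytes remaining after them, which is at least consistent with a nested type-length-value layout.&lt;/p>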
&lt;p>This leads us to the second question which I’ll leave here unanswered:&lt;/p>
&lt;ul>
&lt;li>&lt;em>What vendor is deciding it’s a good idea to use this reserved for development attribute, and what are they using it for?&lt;/em>&lt;/li>
&lt;/ul>
&lt;h1 id="flippy-flappy-whos-having-a-bad-time">Flippy-Flappy: Who’s Having a Bad Time?&lt;/h1>
&lt;p>Finally, let’s see who’s having a bad time: what are the top routes that are shifting paths or being withdrawn completely during the day? Here’s the top 10 active NLRIs with the number of times the route was included in an UPDATE:&lt;/p>
&lt;div id="alweeynynt" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#alweeynynt table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#alweeynynt thead, #alweeynynt tbody, #alweeynynt tfoot, #alweeynynt tr, #alweeynynt td, #alweeynynt th {
border-style: none;
}
&amp;#10;#alweeynynt p {
margin: 0;
padding: 0;
}
&amp;#10;#alweeynynt .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#alweeynynt .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#alweeynynt .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#alweeynynt .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#alweeynynt .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#alweeynynt .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#alweeynynt .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#alweeynynt .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#alweeynynt .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#alweeynynt .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#alweeynynt .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#alweeynynt .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#alweeynynt .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#alweeynynt .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#alweeynynt .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#alweeynynt .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#alweeynynt .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#alweeynynt .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#alweeynynt .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#alweeynynt .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#alweeynynt .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#alweeynynt .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#alweeynynt .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#alweeynynt .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#alweeynynt .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#alweeynynt .gt_left {
text-align: left;
}
&amp;#10;#alweeynynt .gt_center {
text-align: center;
}
&amp;#10;#alweeynynt .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#alweeynynt .gt_font_normal {
font-weight: normal;
}
&amp;#10;#alweeynynt .gt_font_bold {
font-weight: bold;
}
&amp;#10;#alweeynynt .gt_font_italic {
font-style: italic;
}
&amp;#10;#alweeynynt .gt_super {
font-size: 65%;
}
&amp;#10;#alweeynynt .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#alweeynynt .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#alweeynynt .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#alweeynynt .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#alweeynynt .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#alweeynynt .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#alweeynynt .gt_indent_5 {
text-indent: 25px;
}
&amp;#10;#alweeynynt .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
&amp;#10;#alweeynynt div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
&lt;/style>
&lt;table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
&lt;thead>
&lt;tr class="gt_col_headings">
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="nlri">nlri&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="update_count">update_count&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">140.99.244.0/23&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">2596&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">107.154.97.0/24&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">2583&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">45.172.92.0/22&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">2494&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">151.236.111.0/24&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">2312&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">205.164.85.0/24&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">2189&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">41.209.0.0/18&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">2069&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">143.255.204.0/22&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">2048&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">176.124.58.0/24&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">1584&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">187.1.11.0/24&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">1582&lt;/td>&lt;/tr>
&lt;tr>&lt;td headers="nlri" class="gt_row gt_right">187.1.13.0/24&lt;/td>
&lt;td headers="update_count" class="gt_row gt_right">1580&lt;/td>&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>Looks like anyone on &lt;strong>140.99.244.0/23&lt;/strong> was having a bad time during this day. This space is owned by a company called &lt;a href="https://www.epicup.com/">EpicUp&lt;/a>… more like EpicDown! *groan*.&lt;/p>
&lt;p>Graphing the updates and complete withdraws over the course of the day paints a bad picture:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/index_files/figure-html/unnamed-chunk-19-1.png" width="672" />
The top graph looks like a straight line, but that’s because this route is present in almost every single 30-second block of updates. There are 2,879 30-second blocks and it’s present as either a different path or a withdrawn route in 2,637 of them, or 91.6%!&lt;/p>
&lt;p>We know the route is flapping, but &lt;em>how&lt;/em> is it flapping, and who is to blame? The best way to visualise this is a graph, with the ASNs in all paths to that network as nodes and edges showing the pairs of ASNs in the paths. I’ve colourised the edges by how many updates were seen with each pair of ASes, binned into groups of 300:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2024-01-08-a-day-in-the-life-the-bgp-table/index_files/figure-html/unnamed-chunk-21-1.png" width="672" />
What a mess! You can make out the primary path down the centre through NTT (2914) and Lumen/Level3 (3356), but for whatever reason (bad link? power outages? router crashing?) the path is moving between these tier 1 ISPs and others, including Arelion (1299) and PCCW (3491). While it’s almost impossible to identify the exact reason for the route flapping using this data only, what it does show is the amazing peering diversity of modern global networks, and the resiliency of a 33-year-old routing protocol.&lt;/p>
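&lt;p>For anyone wanting to build a similar graph, the edge list can be sketched as follows: each adjacent pair of ASNs in a path becomes an edge, and the count is the number of paths that pair was seen in. This is a minimal illustration with a hypothetical &lt;em>path_edges&lt;/em> helper and made-up paths (64500 and 64510 are documentation ASNs), not the code behind the plot, and it skips the binning step.&lt;/p>

```shell
#!/bin/sh
# Read AS paths (one per line) on stdin and print "asn_a asn_b count" edges.
# path_edges is a hypothetical helper written for illustration.
path_edges() {
    awk 'NF > 1 {
        i = 1
        while (i != NF) {                 # stop before the last ASN
            if ($i != $(i + 1)) {         # ignore prepends (an ASN followed by itself)
                edges[$i " " $(i + 1)]++
            }
            i++
        }
    }
    END { for (e in edges) print e, edges[e] }' | sort
}

# Three made-up paths to the same network through different transit ASNs
printf '%s\n' \
    "64500 2914 3356 64510" \
    "64500 2914 1299 64510" \
    "64500 2914 3356 3356 64510" | path_edges
```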
&lt;h1 id="just-the-beginning">Just The Beginning&lt;/h1>
&lt;p>There’s a big problem with a data set like this: there’s just too much to look at. I needed to keep a lid on it so this article didn’t balloon out to 30,000 words, but there’s another five rabbit holes I could have gone down. That’s not including the questions I’ve left unanswered.&lt;/p>
&lt;p>With the global BGP table, you’ve got a summary of an entire world encapsulated in a few packets. Your BGP updates could be the result of political unrest, natural phenomena like earthquakes or fires, or simply a network admin’s fat finger. You’ve got the economics of internet peering, and you’ve got the human element of different administrators with different capabilities coming together to bring up connectivity. And somehow it manages to work, well, most of the time. There’s something both bizarre and beautiful about seeing all of that humanity encapsulated and streamed as small little updates into your laptop.&lt;/p></description></item><item><title>Cracking Open SCEP</title><link>https://clt.blog.foletta.net/post/2024-07-01-cracking-open-scep/</link><pubDate>Wed, 10 Jul 2024 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2024-07-01-cracking-open-scep/</guid><description>&lt;p>Most of the posts on this site tend to be long form, a result of me finding it hard to leave stones unturned. This leads to big gaps between posts; in fact the radio silence over the past nine months is because I&amp;rsquo;ve had two in draft form and haven&amp;rsquo;t been able to get them over the line.&lt;/p>
&lt;p>As an antidote to this I&amp;rsquo;ve put together something a little more bite-size. In this post we&amp;rsquo;re going to crack open a &lt;em>Simple Certificate Enrollment Protocol (SCEP)&lt;/em> request. We&amp;rsquo;ll do this on the command line, using the openssl tool to peer underneath the hood, and get a good understanding of some of the structures, and the verification and encryption processes.&lt;/p>
&lt;h1 id="the-scep-request">The SCEP Request&lt;/h1>
&lt;p>Here&amp;rsquo;s a screenshot of a packet capture taken during a SCEP request for a new certificate:&lt;/p>
&lt;p>&lt;img src="scep_capture.png" alt="SCEP Capture">&lt;/p>
&lt;p>This SCEP request is actually two requests: the first returns an X509 CA certificate, and the second is the certificate request. We&amp;rsquo;ll see how the X509 certificate is used later on, but if we focus in on the second one we see the bulk of the request is passed in the &lt;em>message&lt;/em> query parameter. I&amp;rsquo;ve copied the contents of this to a file named &lt;em>scep_message&lt;/em>:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">&lt;span style="color:#75715e"># Size in bytes&lt;/span>
wc -c scep_message
&lt;span style="color:#ae81ff">4089&lt;/span>
&lt;span style="color:#75715e"># First 64 bytes&lt;/span>
cut -c1-64 scep_message
MIILLAYJKoZIhvcNAQcCoIILHTCCCxkCAQExDzANBglghkgBZQMEAgMFADCCBM8G
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This message parameter contains the signing request, wrapped up like an onion (sometimes including the tears), with layer after layer of different encodings and structures. This first &lt;em>message&lt;/em> parameter is base64 encoded and then URI encoded, so we undo both and store what I&amp;rsquo;ll call the &amp;lsquo;raw&amp;rsquo; SCEP in a file called &lt;em>scep_raw&lt;/em>.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">&lt;span style="color:#75715e"># Remove the URI and base64 encoding&lt;/span>
&amp;lt; scep_message perl -MURI::Escape -e &lt;span style="color:#e6db74">&amp;#39;print uri_unescape(&amp;lt;STDIN&amp;gt;)&amp;#39;&lt;/span> | base64 -d &amp;gt; scep_raw
&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="signing">Signing&lt;/h2>
&lt;p>Before moving on, a quick view of the PKI layout:&lt;/p>
&lt;ul>
&lt;li>The CN of the certificate request is &lt;strong>BlogPostCert&lt;/strong>.&lt;/li>
&lt;li>I&amp;rsquo;ve created a sub-CA with a CN of &lt;strong>Blog Post Sub CA&lt;/strong> that will sign the request.&lt;/li>
&lt;li>This sub-CA is signed by the root CA which has a CN of &lt;strong>foletta.xyz Root CA&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;p>Now we can get into the meat and bones. After URI/base64 decoding, the next wrapper is &lt;a href="https://en.wikipedia.org/wiki/Cryptographic_Message_Syntax">Cryptographic Message Syntax (CMS)&lt;/a> encapsulated data. Originally part of the PKCS standards defined by RSA Security (PKCS7 to be exact), CMS is now an IETF standard under &lt;a href="https://datatracker.ietf.org/doc/html/rfc5652">RFC 5652&lt;/a>. It provides a way to digitally sign, digest, authenticate, or encrypt arbitrary message content.&lt;/p>
&lt;p>Using the openssl &lt;em>cms&lt;/em> command with the &lt;em>-print&lt;/em> argument, we look at the structure of this first CMS wrapper. I&amp;rsquo;ve redacted some of the less-relevant content and added some comments:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">&lt;span style="color:#75715e"># Print the CMS structure&lt;/span>
openssl cms -in scep_raw -cmsout -inform DER -print
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">CMS_ContentInfo:
contentType: pkcs7-signedData &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.7.2&lt;span style="color:#f92672">)&lt;/span>
d.signedData:
version: &lt;span style="color:#ae81ff">1&lt;/span>
digestAlgorithms:
algorithm: sha512 &lt;span style="color:#f92672">(&lt;/span>2.16.840.1.101.3.4.2.3&lt;span style="color:#f92672">)&lt;/span>
parameter: NULL
encapContentInfo:
eContentType: pkcs7-data &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.7.1&lt;span style="color:#f92672">)&lt;/span>
eContent:
&lt;span style="color:#75715e"># The verified content (removed for brevity)&lt;/span>
certificates:
&lt;span style="color:#75715e"># This is an inline certificate, partner of the private key that signed the content&lt;/span>
d.certificate:
cert_info:
version: &lt;span style="color:#ae81ff">2&lt;/span>
serialNumber: 0x3945353734443130303430394339423632364233303838354342353735443100
signature:
algorithm: sha512WithRSAEncryption &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.1.13&lt;span style="color:#f92672">)&lt;/span>
parameter: NULL
&lt;span style="color:#75715e"># We see this is a temporary, self-signed certificate&lt;/span>
issuer: C&lt;span style="color:#f92672">=&lt;/span>AU, ST&lt;span style="color:#f92672">=&lt;/span>Victoria, L&lt;span style="color:#f92672">=&lt;/span>Melbourne, O&lt;span style="color:#f92672">=&lt;/span>foletta.xyz, OU&lt;span style="color:#f92672">=&lt;/span>IT, CN&lt;span style="color:#f92672">=&lt;/span>BlogPostCert
&lt;span style="color:#75715e"># We get a week to sign the request&lt;/span>
validity:
notBefore: Jul &lt;span style="color:#ae81ff">8&lt;/span> 22:24:59 &lt;span style="color:#ae81ff">2024&lt;/span> GMT
notAfter: Jul &lt;span style="color:#ae81ff">15&lt;/span> 00:24:59 &lt;span style="color:#ae81ff">2024&lt;/span> GMT
subject: C&lt;span style="color:#f92672">=&lt;/span>AU, ST&lt;span style="color:#f92672">=&lt;/span>Victoria, L&lt;span style="color:#f92672">=&lt;/span>Melbourne, O&lt;span style="color:#f92672">=&lt;/span>foletta.xyz, OU&lt;span style="color:#f92672">=&lt;/span>IT, CN&lt;span style="color:#f92672">=&lt;/span>BlogPostCert
key: X509_PUBKEY:
algor:
algorithm: rsaEncryption &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.1.1&lt;span style="color:#f92672">)&lt;/span>
parameter: NULL
public_key: &lt;span style="color:#f92672">(&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span> unused bits&lt;span style="color:#f92672">)&lt;/span>
&lt;span style="color:#75715e"># Removed for brevity&lt;/span>
issuerUID: &amp;lt;ABSENT&amp;gt;
subjectUID: &amp;lt;ABSENT&amp;gt;
extensions:
&amp;lt;ABSENT&amp;gt;
sig_alg:
algorithm: sha512WithRSAEncryption &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.1.13&lt;span style="color:#f92672">)&lt;/span>
parameter: NULL
signature: &lt;span style="color:#f92672">(&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span> unused bits&lt;span style="color:#f92672">)&lt;/span>
&lt;span style="color:#75715e"># Signature to verify removed for brevity&lt;/span>
crls:
&amp;lt;ABSENT&amp;gt;
&lt;span style="color:#75715e"># The signing information to allow the content to be verified&lt;/span>
signerInfos:
version: &lt;span style="color:#ae81ff">1&lt;/span>
d.issuerAndSerialNumber:
&lt;span style="color:#75715e"># Issuer and serial number of the certificate required to verify.&lt;/span>
&lt;span style="color:#75715e"># This matches the above inline certificate&lt;/span>
issuer: C&lt;span style="color:#f92672">=&lt;/span>AU, ST&lt;span style="color:#f92672">=&lt;/span>Victoria, L&lt;span style="color:#f92672">=&lt;/span>Melbourne, O&lt;span style="color:#f92672">=&lt;/span>foletta.xyz, OU&lt;span style="color:#f92672">=&lt;/span>IT, CN&lt;span style="color:#f92672">=&lt;/span>BlogPostCert
serialNumber: 0x3945353734443130303430394339423632364233303838354342353735443100
digestAlgorithm:
algorithm: sha512 &lt;span style="color:#f92672">(&lt;/span>2.16.840.1.101.3.4.2.3&lt;span style="color:#f92672">)&lt;/span>
parameter: NULL
signedAttrs:
&lt;span style="color:#75715e"># Removed for brevity&lt;/span>
signatureAlgorithm:
algorithm: rsaEncryption &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.1.1&lt;span style="color:#f92672">)&lt;/span>
parameter: NULL
signature:
&lt;span style="color:#75715e"># Signature used to verify (removed for brevity).&lt;/span>
unsignedAttrs:
&amp;lt;ABSENT&amp;gt;
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The main question I had was what signs this content? The answer we see from the above output is that it&amp;rsquo;s signed by the requestor&amp;rsquo;s newly generated private key. But as there&amp;rsquo;s no certificate yet (that&amp;rsquo;s the whole point of the request), the requestor creates a temporary self-signed certificate containing the public key, and includes it in the CMS data. This allows the SCEP server to authenticate the data that&amp;rsquo;s been transferred.&lt;/p>
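&lt;p>To see the same mechanism in isolation, here is a minimal sketch that mimics what the requestor does: generate a key pair, wrap the public key in a short-lived self-signed certificate, and sign some content with CMS so that the certificate travels with the signature. The file names and the stand-in content are assumptions made for illustration; a real SCEP client signs an enveloped CSR and adds SCEP-specific signed attributes.&lt;/p>

```shell
#!/bin/sh
# Work in a scratch directory so nothing here touches real keys
workdir=$(mktemp -d)
cd "$workdir"

# 1. New key pair plus a temporary self-signed certificate (valid for 7 days)
openssl req -x509 -newkey rsa:2048 -nodes -keyout temp.key -out temp.cer \
    -days 7 -subj "/CN=BlogPostCert" 2>/dev/null

# 2. Sign stand-in content, embedding both the content and the certificate in the CMS structure
printf 'stand-in for the enveloped CSR' > content.bin
openssl cms -sign -binary -nodetach -md sha512 -in content.bin \
    -signer temp.cer -inkey temp.key -outform DER -out signed.der

# 3. Check the signature using only the certificate carried inside the message.
#    -noverify skips chain validation, which a self-signed certificate would fail.
openssl cms -verify -noverify -in signed.der -inform DER -out recovered.bin 2>/dev/null

# The recovered content should match what was signed
cmp content.bin recovered.bin
```

&lt;p>Note that this only proves the content was signed by whoever holds the private key matching the embedded certificate; it says nothing about who that is.&lt;/p>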
&lt;p>This self-signed certificate will come in handy later, so it&amp;rsquo;s extracted using the &lt;em>-verify&lt;/em> and &lt;em>-signer&lt;/em> arguments:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">&lt;span style="color:#75715e"># Extract the self signed certificate&lt;/span>
openssl cms -verify -in scep_raw -inform DER -signer self_signed.cer -noverify -out /dev/null
&lt;span style="color:#75715e"># View the self-signed cert&lt;/span>
openssl x509 -in self_signed.cer -noout -text
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">Certificate:
Data:
Version: &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#f92672">(&lt;/span>0x2&lt;span style="color:#f92672">)&lt;/span>
Serial Number:
39:45:35:37:34:44:31:30:30:34:30:39:43:39:42:36:32:36:42:33:30:38:38:35:43:42:35:37:35:44:31:00
Signature Algorithm: sha512WithRSAEncryption
Issuer: C &lt;span style="color:#f92672">=&lt;/span> AU, ST &lt;span style="color:#f92672">=&lt;/span> Victoria, L &lt;span style="color:#f92672">=&lt;/span> Melbourne, O &lt;span style="color:#f92672">=&lt;/span> foletta.xyz, OU &lt;span style="color:#f92672">=&lt;/span> IT, CN &lt;span style="color:#f92672">=&lt;/span> BlogPostCert
Validity
Not Before: Jul &lt;span style="color:#ae81ff">8&lt;/span> 22:24:59 &lt;span style="color:#ae81ff">2024&lt;/span> GMT
Not After : Jul &lt;span style="color:#ae81ff">15&lt;/span> 00:24:59 &lt;span style="color:#ae81ff">2024&lt;/span> GMT
Subject: C &lt;span style="color:#f92672">=&lt;/span> AU, ST &lt;span style="color:#f92672">=&lt;/span> Victoria, L &lt;span style="color:#f92672">=&lt;/span> Melbourne, O &lt;span style="color:#f92672">=&lt;/span> foletta.xyz, OU &lt;span style="color:#f92672">=&lt;/span> IT, CN &lt;span style="color:#f92672">=&lt;/span> BlogPostCert
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: &lt;span style="color:#f92672">(&lt;/span>&lt;span style="color:#ae81ff">2048&lt;/span> bit&lt;span style="color:#f92672">)&lt;/span>
Modulus:
&lt;span style="color:#75715e"># Removed for brevity&lt;/span>
Exponent: &lt;span style="color:#ae81ff">65537&lt;/span> &lt;span style="color:#f92672">(&lt;/span>0x10001&lt;span style="color:#f92672">)&lt;/span>
Signature Algorithm: sha512WithRSAEncryption
Signature Value:
&lt;span style="color:#75715e"># Removed for brevity&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="encryption">Encryption&lt;/h2>
&lt;p>The keen eyed will have noticed that the &lt;code>eContentType&lt;/code> was &lt;code>pkcs7-data&lt;/code>. I.e. inside this CMS encapsulation is another CMS encapsulation, except this one is responsible for encrypting the certificate request.&lt;/p>
&lt;p>Using the &lt;em>-verify&lt;/em> option we can verify the signature and extract the content, piping it to openssl again to view the structure of the encrypted CMS. The seemingly contradictory &lt;em>-noverify&lt;/em> disables verification of the message&amp;rsquo;s signing certificate (while still checking the actual signature), as we have no chain of trust with which to validate that self-signed certificate.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-{.sh" data-lang="{.sh">&lt;span style="color:#75715e"># Verify, extract, and pipe out contents&lt;/span>
openssl cms -verify -in scep_raw -inform DER -noverify |
&lt;span style="color:#75715e"># Print second CMS structure&lt;/span>
openssl cms -inform DER -cmsout -print
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">CMS_ContentInfo:
contentType: pkcs7-envelopedData &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.7.3&lt;span style="color:#f92672">)&lt;/span>
d.envelopedData:
version: &lt;span style="color:#ae81ff">0&lt;/span>
originatorInfo: &amp;lt;ABSENT&amp;gt;
recipientInfos:
d.ktri:
version: &lt;span style="color:#ae81ff">0&lt;/span>
d.issuerAndSerialNumber:
&lt;span style="color:#75715e"># CN of the issuer and serial of the certificate/keypair required to decrypt the contents&lt;/span>
issuer: C&lt;span style="color:#f92672">=&lt;/span>AU, ST&lt;span style="color:#f92672">=&lt;/span>Victoria, L&lt;span style="color:#f92672">=&lt;/span>Melbourne, CN&lt;span style="color:#f92672">=&lt;/span>foletta.xyz Root CA/emailAddress&lt;span style="color:#f92672">=&lt;/span>greg@foletta.org
serialNumber: &lt;span style="color:#ae81ff">13257416122132238758&lt;/span>
keyEncryptionAlgorithm:
algorithm: rsaEncryption &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.1.1&lt;span style="color:#f92672">)&lt;/span>
parameter: NULL
encryptedKey:
&lt;span style="color:#75715e"># 255 byte random key, encrypted with the RSA certificate&lt;/span>
encryptedContentInfo:
contentType: pkcs7-data &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.1.7.1&lt;span style="color:#f92672">)&lt;/span>
contentEncryptionAlgorithm:
&lt;span style="color:#75715e"># Note the symmetric cipher below&lt;/span>
algorithm: des-ede3-cbc &lt;span style="color:#f92672">(&lt;/span>1.2.840.113549.3.7&lt;span style="color:#f92672">)&lt;/span>
&lt;span style="color:#75715e"># I *think* this is the initialisation vector for 3DES&lt;/span>
parameter: OCTET STRING:
&lt;span style="color:#ae81ff">0000&lt;/span> - &lt;span style="color:#ae81ff">01&lt;/span> ed &lt;span style="color:#ae81ff">63&lt;/span> &lt;span style="color:#ae81ff">51&lt;/span> &lt;span style="color:#ae81ff">42&lt;/span> &lt;span style="color:#ae81ff">91&lt;/span> 6e a0- ..cQB.n.
encryptedContent:
&lt;span style="color:#75715e"># Content encrypted with the symmetric key&lt;/span>
unprotectedAttrs:
&amp;lt;ABSENT&amp;gt;
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The two main sections are &lt;em>encryptedKey&lt;/em> and &lt;em>encryptedContent&lt;/em>. The content-encryption key is randomly generated and used to encrypt the data with a symmetric cipher (3DES), then the key itself is encrypted using the public key of the signing CA that was requested in the first step. I have a copy of the private key of the signing CA (&amp;ldquo;Blog Post SubCA&amp;rdquo;), so we can decrypt the content and look at the request.&lt;/p>
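&lt;p>The same two-layer scheme can be reproduced locally. This is a minimal sketch with a throwaway recipient key pair standing in for the CA: the file names, the &lt;em>ExampleRecipient&lt;/em> subject and the payload are made up for illustration and aren&amp;rsquo;t part of the captured exchange.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh"># Sketch only: every file name and the subject below are placeholders
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Throwaway recipient key pair and certificate, standing in for the CA
openssl req -new -x509 -newkey rsa:2048 -nodes -keyout recipient.key -subj "/CN=ExampleRecipient" -days 7 -out recipient.cer 2>/dev/null

printf 'pretend this is a CSR' > request.bin

# Encrypt: a random 3DES key encrypts the content, and that key is
# itself encrypted with the RSA public key found in recipient.cer
openssl cms -encrypt -binary -des3 -in request.bin -outform DER -out enveloped.der recipient.cer

# Inspect the structure, showing only the content types and algorithms
openssl cms -cmsout -print -noout -inform DER -in enveloped.der | grep -E "contentType|algorithm"

# Decrypt: the private key recovers the 3DES key, which recovers the content
openssl cms -decrypt -inform DER -in enveloped.der -recip recipient.cer -inkey recipient.key -out decrypted.bin
cmp request.bin decrypted.bin
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The split exists because RSA can only encrypt a small amount of data directly, so it&amp;rsquo;s used to protect the short symmetric key rather than the content itself.&lt;/p>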
&lt;h2 id="the-signing-request">The Signing Request&lt;/h2>
&lt;p>Using the &lt;em>-decrypt&lt;/em> option and the sub-CA certificate/private key, we can decrypt the second CMS to take a look at the certificate signing request, again redacted for brevity:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">&lt;span style="color:#75715e"># Extract, verify and pipe out content&lt;/span>
openssl cms -in scep_raw -verify -inform DER -noverify |
&lt;span style="color:#75715e"># Decrypt and pipe out content&lt;/span>
openssl cms -inform DER -decrypt -recip Blog_Post_SubCA.cer -inkey Blog_Post_SubCA.key |
&lt;span style="color:#75715e"># Parse certificate request&lt;/span>
openssl req -inform DER -noout -text
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">Certificate Request:
Data:
Version: &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#f92672">(&lt;/span>0x0&lt;span style="color:#f92672">)&lt;/span>
Subject: C &lt;span style="color:#f92672">=&lt;/span> AU, ST &lt;span style="color:#f92672">=&lt;/span> Victoria, L &lt;span style="color:#f92672">=&lt;/span> Melbourne, O &lt;span style="color:#f92672">=&lt;/span> foletta.xyz, OU &lt;span style="color:#f92672">=&lt;/span> IT, CN &lt;span style="color:#f92672">=&lt;/span> BlogPostCert
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: &lt;span style="color:#f92672">(&lt;/span>&lt;span style="color:#ae81ff">2048&lt;/span> bit&lt;span style="color:#f92672">)&lt;/span>
Modulus:
&lt;span style="color:#75715e"># Removed for brevity&lt;/span>
Exponent: &lt;span style="color:#ae81ff">65537&lt;/span> &lt;span style="color:#f92672">(&lt;/span>0x10001&lt;span style="color:#f92672">)&lt;/span>
Attributes:
challengePassword :8qKfdeen
Requested Extensions:
Signature Algorithm: sha256WithRSAEncryption
Signature Value:
&lt;span style="color:#75715e"># Removed for brevity&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>At the core we&amp;rsquo;ve got a bog-standard certificate request ready for signing, with no fancy requested extensions. The only attribute is the SCEP challenge password.&lt;/p>
&lt;p>A quick aside: as per &lt;a href="https://www.rfc-editor.org/rfc/rfc4210#section-5.2.1">RFC 4210&lt;/a>, the signer is able to change any field in this CSR except the public key.&lt;/p>
&lt;h1 id="scep-response">SCEP Response&lt;/h1>
&lt;p>The response from the SCEP server containing the certificate is similar to the request:&lt;/p>
&lt;ul>
&lt;li>The verification CMS, signed with the SCEP server&amp;rsquo;s (CA) private key, allowing the requestor to verify it using the CA certificate retrieved in the first step.&lt;/li>
&lt;li>The encrypted CMS, with a key encrypted using the public key from the self-signed certificate that was sent in the request, allowing the requestor to decrypt it using its generated private key.&lt;/li>
&lt;/ul>
&lt;p>The difference is that at the core is a degenerate case of SignedData, with no signers and no content, just the certificates. In our case, it has our newly-signed certificate, as well as the certificate of the CA that signed it. The requestor now has the certificate to use as a server certificate on a web-server, or as a client certificate to authenticate themselves to a service.&lt;/p>
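&lt;p>If you want to see what a degenerate SignedData looks like without a SCEP server to hand, openssl can build one from any certificate. A minimal sketch, with a throwaway certificate and placeholder file names of my own choosing:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh"># Sketch only: every file name and the subject below are placeholders
set -e
workdir=$(mktemp -d)
cd "$workdir"

# A throwaway certificate to carry
openssl req -new -x509 -newkey rsa:2048 -nodes -keyout example.key -subj "/CN=ExampleCert" -days 7 -out example.cer 2>/dev/null

# Build a certs-only PKCS#7: SignedData with no signers and no content
openssl crl2pkcs7 -nocrl -certfile example.cer -outform DER -out certs_only.p7b

# List the subject and issuer of each certificate it carries
openssl pkcs7 -inform DER -in certs_only.p7b -print_certs -noout
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With nothing signed and nothing encrypted, the structure is purely a container for handing certificates around.&lt;/p>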
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">&lt;span style="color:#75715e"># Extract, verify, and pipe response&lt;/span>
openssl cms -verify -in scep_response -inform der |
&lt;span style="color:#75715e"># Decrypt and pipe response&lt;/span>
openssl cms -decrypt -inform der -recip self_signed.cer -inkey blog.key |
&lt;span style="color:#75715e"># View &amp;#39;degenerate&amp;#39; signed data certificates&lt;/span>
&lt;span style="color:#75715e"># I can&amp;#39;t get CMS to open it, so I use pkcs7&lt;/span>
openssl pkcs7 -inform der -noout -print_certs -text
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">&lt;span style="color:#75715e"># This first certificate is the signed response to our request&lt;/span>
Certificate:
Data:
Version: &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#f92672">(&lt;/span>0x2&lt;span style="color:#f92672">)&lt;/span>
Serial Number: &lt;span style="color:#ae81ff">5037000822208222218&lt;/span> &lt;span style="color:#f92672">(&lt;/span>0x45e7058f8512540a&lt;span style="color:#f92672">)&lt;/span>
Signature Algorithm: sha256WithRSAEncryption
Issuer: C&lt;span style="color:#f92672">=&lt;/span>AU, ST&lt;span style="color:#f92672">=&lt;/span>Victoria, L&lt;span style="color:#f92672">=&lt;/span>Melbourne, O&lt;span style="color:#f92672">=&lt;/span>foletta.xyz, OU&lt;span style="color:#f92672">=&lt;/span>IT, CN&lt;span style="color:#f92672">=&lt;/span>Blog Post Sub CA
Validity
Not Before: Jul &lt;span style="color:#ae81ff">8&lt;/span> 22:25:03 &lt;span style="color:#ae81ff">2024&lt;/span> GMT
Not After : Jul &lt;span style="color:#ae81ff">8&lt;/span> 22:25:03 &lt;span style="color:#ae81ff">2025&lt;/span> GMT
Subject: C&lt;span style="color:#f92672">=&lt;/span>AU, ST&lt;span style="color:#f92672">=&lt;/span>Victoria, L&lt;span style="color:#f92672">=&lt;/span>Melbourne, O&lt;span style="color:#f92672">=&lt;/span>foletta.xyz, OU&lt;span style="color:#f92672">=&lt;/span>IT, CN&lt;span style="color:#f92672">=&lt;/span>BlogPostCert
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: &lt;span style="color:#f92672">(&lt;/span>&lt;span style="color:#ae81ff">2048&lt;/span> bit&lt;span style="color:#f92672">)&lt;/span>
Modulus:
&lt;span style="color:#75715e"># Removed for brevity&lt;/span>
Exponent: &lt;span style="color:#ae81ff">65537&lt;/span> &lt;span style="color:#f92672">(&lt;/span>0x10001&lt;span style="color:#f92672">)&lt;/span>
X509v3 extensions:
&lt;span style="color:#75715e"># We can&amp;#39;t use this certificate to issue other certificates&lt;/span>
X509v3 Basic Constraints: critical
CA:FALSE
&lt;span style="color:#75715e"># Identifier of the public key of this certificate&lt;/span>
X509v3 Subject Key Identifier:
7A:DC:05:7B:2C:A8:E3:F8:9C:E6:40:72:6B:50:4F:81:85:C0:9C:6B
&lt;span style="color:#75715e"># Identifier of the public key of the CA that signed this certificate&lt;/span>
X509v3 Authority Key Identifier:
keyid:7D:10:79:F8:A3:27:D5:8A:31:1C:82:1C:29:40:DF:AD:6B:61:44:D1
DirName:/C&lt;span style="color:#f92672">=&lt;/span>AU/ST&lt;span style="color:#f92672">=&lt;/span>Victoria/L&lt;span style="color:#f92672">=&lt;/span>Melbourne/CN&lt;span style="color:#f92672">=&lt;/span>foletta.xyz Root CA/emailAddress&lt;span style="color:#f92672">=&lt;/span>greg@foletta.org
serial:B7:FB:CD:B8:E7:3A:91:A6
Signature Algorithm: sha256WithRSAEncryption
Signature Value:
&lt;span style="color:#75715e"># Removed for brevity&lt;/span>
&lt;span style="color:#75715e"># This certificate is a copy of the sub-CA that signed the certificate, with the bulk removed for brevity&lt;/span>
Certificate:
Data:
Version: &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#f92672">(&lt;/span>0x2&lt;span style="color:#f92672">)&lt;/span>
Serial Number:
b7:fb:cd:b8:e7:3a:91:a6
Signature Algorithm: sha256WithRSAEncryption
Issuer: C&lt;span style="color:#f92672">=&lt;/span>AU, ST&lt;span style="color:#f92672">=&lt;/span>Victoria, L&lt;span style="color:#f92672">=&lt;/span>Melbourne, CN&lt;span style="color:#f92672">=&lt;/span>foletta.xyz Root CA/emailAddress&lt;span style="color:#f92672">=&lt;/span>greg@foletta.org
Validity
Not Before: Jul &lt;span style="color:#ae81ff">1&lt;/span> 02:27:55 &lt;span style="color:#ae81ff">2024&lt;/span> GMT
Not After : Sep &lt;span style="color:#ae81ff">26&lt;/span> 06:21:46 &lt;span style="color:#ae81ff">2032&lt;/span> GMT
Subject: C&lt;span style="color:#f92672">=&lt;/span>AU, ST&lt;span style="color:#f92672">=&lt;/span>Victoria, L&lt;span style="color:#f92672">=&lt;/span>Melbourne, O&lt;span style="color:#f92672">=&lt;/span>foletta.xyz, OU&lt;span style="color:#f92672">=&lt;/span>IT, CN&lt;span style="color:#f92672">=&lt;/span>Blog Post Sub CA
&lt;span style="color:#75715e"># Rest of the CA certificate follows from here&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>Not sure if this ended up being &amp;lsquo;bite-size&amp;rsquo; in the end, but it was an enjoyable challenge to take the request and response and peel back the layers. The openssl application has got an immense amount of functionality fronted by a pretty hard-to-use interface. I find challenges like this are the best way to get familiar with its incantations, as opposed to copying and pasting commands in from the internet.&lt;/p></description></item><item><title>Keeping Them Honest</title><link>https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/</link><pubDate>Wed, 25 Oct 2023 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/</guid><description>&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/core-js/shim.min.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/react/react.min.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/react/react-dom.min.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/reactwidget/react-tools.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/htmlwidgets/htmlwidgets.js">&lt;/script>
&lt;link href="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/reactable/reactable.css" rel="stylesheet" />
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/reactable-binding/reactable.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/core-js/shim.min.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/react/react.min.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/react/react-dom.min.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/reactwidget/react-tools.js">&lt;/script>
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/htmlwidgets/htmlwidgets.js">&lt;/script>
&lt;link href="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/reactable/reactable.css" rel="stylesheet" />
&lt;script src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/reactable-binding/reactable.js">&lt;/script>
&lt;p>Last month my car, a Toyota Kluger, was hit while parked in front of my house. Luckily no one was injured and, while it was annoying, the person responsible had insurance. The insurance company came back and determined that the car had been written off and I would be paid out the market value of the car. The question in my mind was ‘what is the market value?’ How could I keep the insurance company honest and make sure I wasn’t getting stiffed?&lt;/p>
&lt;p>In this post I’ll go through an attempt to find out the market price for a Toyota Kluger. We’ll automate the retrieval of data and have a look at its features, create a Bayesian model, and see how this model performs in predicting the sale price of a Kluger.&lt;/p>
&lt;p>More than anything, this was a chance to put into practice the Bayesian theory I’ve been learning in books like &lt;a href="https://xcelab.net/rm/statistical-rethinking/">Statistical Rethinking&lt;/a> and &lt;a href="https://www.bayesrulesbook.com/">Bayes Rules!&lt;/a>. As with any first dip of the toe, there could be assumptions I’ve made that are incorrect, or things that are outright wrong. I’d appreciate &lt;a href="mailto:greg@foletta.org">feedback and corrections&lt;/a>.&lt;/p>
&lt;p>Finally, I won’t be showing as much of the code as I’ve done in previous posts. If you’d like to dive under the hood, you can find the source for this article &lt;a href="https://github.com/gregfoletta/articles.foletta.org/blob/production/content/post/2023-09-28-honest-insurance-company/index.Rmarkdown">here&lt;/a>.&lt;/p>
&lt;h1 id="tldr">TL;DR&lt;/h1>
&lt;p>A word of warning before anyone gets too invested: this article is slightly anticlimactic. We do find that the sale price of a Kluger will reduce by around 0.6% for every 1,000 km driven, but there’s still a lot of variability that we don’t capture, making our prediction intervals too wide to be of any use. There are other factors that go into determining a price that we need to take into account in our model.&lt;/p>
&lt;p>But this isn’t the end, it’s just the start. Better to start off with a simple model, assess, and slowly increase the complexity, rather than throwing a whole bathtub of features at the model straight up. It also leaves me with material for another article!&lt;/p>
&lt;h1 id="data-aquisition">Data Aquisition&lt;/h1>
&lt;p>The first step is to acquire some data on the current market for Toyota Klugers. A small distinction is that data will be the &lt;em>for sale&lt;/em> price of the car, rather than the &lt;em>sold&lt;/em> price, but it still should provide us with a good representation of the market.&lt;/p>
&lt;p>We’ll pull the data from a site that advertises cars for sale. The site requires Javascript to render, so a simple HTTP GET of the site won’t work. Instead we need to render the page in a browser. We’ll use a docker instance of the webdriver &lt;a href="https://www.selenium.dev/">Selenium&lt;/a>, interfacing into this with the R package &lt;a href="https://github.com/ropensci/RSelenium">RSelenium&lt;/a> to achieve this. This allows us to browse to the site from a ‘remotely controller’ browser, Javascript and all, and retrieve the information we need.&lt;/p>
&lt;p>We connect to the docker instance, setting the page load strategy to eager. This will speed up the process as we won’t be waiting for stylesheets, images, etc to load.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">rs &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">remoteDriver&lt;/span>(remoteServerAddr &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;172.17.0.2&amp;#39;&lt;/span>, port &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">4444L&lt;/span>)
rs&lt;span style="color:#f92672">$&lt;/span>extraCapabilities&lt;span style="color:#f92672">$&lt;/span>pageLoadStrategy &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#e6db74">&amp;#34;eager&amp;#34;&lt;/span>
rs&lt;span style="color:#f92672">$&lt;/span>&lt;span style="color:#a6e22e">open&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Each page of Klugers for sale is determined by a query string offsetting into the list of Klugers in multiples of 12. We generate the offsets (0, 12, 24, …) and from this the full URI of each page. We then navigate to each page, read the source, and parse it into a structured XML document.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">kluger_source &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">tibble&lt;/span>(
&lt;span style="color:#75715e"># Generate offsets&lt;/span>
offset &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">12&lt;/span> &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">100&lt;/span>),
&lt;span style="color:#75715e"># Create URIs based on offsets&lt;/span>
uri &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">glue&lt;/span>(&lt;span style="color:#e6db74">&amp;#34;{car_site_uri}/cars/used/toyota/kluger/?offset={offset}&amp;#34;&lt;/span>)
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
&lt;span style="color:#75715e"># Naviate to each URI, read and parse the source&lt;/span>
source &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(uri, &lt;span style="color:#f92672">~&lt;/span>{
rs&lt;span style="color:#f92672">$&lt;/span>&lt;span style="color:#a6e22e">navigate&lt;/span>(uri)
rs&lt;span style="color:#f92672">$&lt;/span>&lt;span style="color:#a6e22e">getPageSource&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span> &lt;span style="color:#a6e22e">pluck&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span> &lt;span style="color:#a6e22e">read_html&lt;/span>()
} )
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With the raw source in our hands, we can move on to extracting the pieces of data we need from each of them.&lt;/p>
&lt;h1 id="data-extraction">Data Extraction&lt;/h1>
&lt;p>Let’s define a small helper function &lt;code>xpt()&lt;/code> to make things a little more concise.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># XPath helper function, xpt short for xpath_text&lt;/span>
xpt &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(html, xpath) {
&lt;span style="color:#a6e22e">html_elements&lt;/span>(html, xpath &lt;span style="color:#f92672">=&lt;/span> xpath) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">html_text&lt;/span>()
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Each page has ‘cards’ which contain the details of each car. We run into an issue where not all of them have an odometer reading, which is the critical variable we’re going to use in our modelling later. To get around this, slightly more complicated XPath is required: we find each card by first finding all the odometer &amp;lt;li&amp;gt; tags, then using the &lt;em>ancestor::&lt;/em> axis we find the card &amp;lt;div&amp;gt; that each one sits within. The result: we have only the cards which have odometer readings.&lt;/p>
&lt;p>From there, it’s trivial to extract specific properties from the car sale.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">kluger_data &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
kluger_source &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Find the parent card &amp;lt;div&amp;gt; of all odometer &amp;lt;li&amp;gt; tags.&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
cards &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(source,
&lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">html_elements&lt;/span>(.x,
xpath &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;//li[@data-type = &amp;#39;Odometer&amp;#39;]/ancestor::div[@class = &amp;#39;card-body&amp;#39;]&amp;#34;&lt;/span>
)
)
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># Extract specific values from the card found above&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
price &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(cards, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">xpt&lt;/span>(.x, xpath &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;.//a[@data-webm-clickvalue = &amp;#39;sv-price&amp;#39;]&amp;#34;&lt;/span>)),
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(cards, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">xpt&lt;/span>(.x, xpath &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;.//a[@data-webm-clickvalue = &amp;#39;sv-title&amp;#39;]&amp;#34;&lt;/span>)),
odometer &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(cards, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">xpt&lt;/span>(.x, xpath &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;.//li[@data-type = &amp;#39;Odometer&amp;#39;]&amp;#34;&lt;/span>)),
body &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(cards, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">xpt&lt;/span>(.x, xpath &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;.//li[@data-type = &amp;#39;Body Style&amp;#39;]&amp;#34;&lt;/span>)),
transmission &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(cards, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">xpt&lt;/span>(.x, xpath &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;.//li[@data-type = &amp;#39;Transmission&amp;#39;]&amp;#34;&lt;/span>)),
engine &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(cards, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">xpt&lt;/span>(.x, xpath &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;.//li[@data-type = &amp;#39;Engine&amp;#39;]&amp;#34;&lt;/span>))
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#a6e22e">c&lt;/span>(source, cards, offset)) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">unnest&lt;/span>(&lt;span style="color:#a6e22e">everything&lt;/span>())
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here’s a sample of our raw data:&lt;/p>
&lt;div id="qnxgxzutkf" class=".gt_table" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#qnxgxzutkf table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
&amp;#10;#qnxgxzutkf thead, #qnxgxzutkf tbody, #qnxgxzutkf tfoot, #qnxgxzutkf tr, #qnxgxzutkf td, #qnxgxzutkf th {
border-style: none;
}
&amp;#10;#qnxgxzutkf p {
margin: 0;
padding: 0;
}
&amp;#10;#qnxgxzutkf .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
&amp;#10;#qnxgxzutkf .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
&amp;#10;#qnxgxzutkf .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
&amp;#10;#qnxgxzutkf .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
&amp;#10;#qnxgxzutkf .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
&amp;#10;#qnxgxzutkf .gt_column_spanner_outer:first-child {
padding-left: 0;
}
&amp;#10;#qnxgxzutkf .gt_column_spanner_outer:last-child {
padding-right: 0;
}
&amp;#10;#qnxgxzutkf .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
&amp;#10;#qnxgxzutkf .gt_spanner_row {
border-bottom-style: hidden;
}
&amp;#10;#qnxgxzutkf .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
&amp;#10;#qnxgxzutkf .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
&amp;#10;#qnxgxzutkf .gt_from_md > :first-child {
margin-top: 0;
}
&amp;#10;#qnxgxzutkf .gt_from_md > :last-child {
margin-bottom: 0;
}
&amp;#10;#qnxgxzutkf .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
&amp;#10;#qnxgxzutkf .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#qnxgxzutkf .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
&amp;#10;#qnxgxzutkf .gt_row_group_first td {
border-top-width: 2px;
}
&amp;#10;#qnxgxzutkf .gt_row_group_first th {
border-top-width: 2px;
}
&amp;#10;#qnxgxzutkf .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#qnxgxzutkf .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_first_summary_row.thick {
border-top-width: 2px;
}
&amp;#10;#qnxgxzutkf .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#qnxgxzutkf .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
&amp;#10;#qnxgxzutkf .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#qnxgxzutkf .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
&amp;#10;#qnxgxzutkf .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
&amp;#10;#qnxgxzutkf .gt_left {
text-align: left;
}
&amp;#10;#qnxgxzutkf .gt_center {
text-align: center;
}
&amp;#10;#qnxgxzutkf .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
&amp;#10;#qnxgxzutkf .gt_font_normal {
font-weight: normal;
}
&amp;#10;#qnxgxzutkf .gt_font_bold {
font-weight: bold;
}
&amp;#10;#qnxgxzutkf .gt_font_italic {
font-style: italic;
}
&amp;#10;#qnxgxzutkf .gt_super {
font-size: 65%;
}
&amp;#10;#qnxgxzutkf .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
&amp;#10;#qnxgxzutkf .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
&amp;#10;#qnxgxzutkf .gt_indent_1 {
text-indent: 5px;
}
&amp;#10;#qnxgxzutkf .gt_indent_2 {
text-indent: 10px;
}
&amp;#10;#qnxgxzutkf .gt_indent_3 {
text-indent: 15px;
}
&amp;#10;#qnxgxzutkf .gt_indent_4 {
text-indent: 20px;
}
&amp;#10;#qnxgxzutkf .gt_indent_5 {
text-indent: 25px;
}
&lt;/style>
&lt;div style="font-family:system-ui, &amp;#39;Segoe UI&amp;#39;, Roboto, Helvetica, Arial, sans-serif;border-top-style:solid;border-top-width:2px;border-top-color:#D3D3D3;">
&lt;div class="gt_heading gt_title gt_font_normal" style="text-size:bigger;">Kluger Market Data&lt;/div>
&lt;div class="gt_heading gt_subtitle ">&lt;/div>
&lt;/div>
&lt;div id="qnxgxzutkf" class="reactable html-widget " style="width:auto;height:auto;">&lt;/div>
&lt;script type="application/json" data-for="qnxgxzutkf">{"x":{"tag":{"name":"Reactable","attribs":{"data":{"price":["$15,000*","$33,990*","$13,800*","$33,990*","$42,990","$73,990","$38,700*","$48,990","$35,990","$20,500*","$71,990","$41,990","$20,800*","$69,990","$58,999","$27,888","$34,990*","$61,990","$16,990","$61,880","$46,990","$35,977*","$69,990*","$35,900*","$34,990*","$40,000*","$48,990","$59,888*","$57,887*","$48,000*"],"title":["2011 Toyota Kluger KX-R Auto 2WD MY11","2014 Toyota Kluger Grande Auto 2WD","2008 Toyota Kluger Grande Auto 2WD","2014 Toyota Kluger Grande Auto 2WD","2019 Toyota Kluger GX Auto 2WD","2021 Toyota Kluger Grande Auto eFour","2017 Toyota Kluger GXL Auto AWD","2019 Toyota Kluger GXL Auto 2WD","2014 Toyota Kluger Grande Auto 2WD","2013 Toyota Kluger Grande Auto AWD","2022 Toyota Kluger GXL Auto eFour","2019 Toyota Kluger GX Auto AWD","2010 Toyota Kluger KX-S Auto AWD","2022 Toyota Kluger Grande Auto 2WD","2021 Toyota Kluger GX Auto eFour","2017 Toyota Kluger GX Auto AWD","2015 Toyota Kluger Grande Auto AWD","2023 Toyota Kluger GX Auto eFour","2011 Toyota Kluger KX-R Auto 2WD MY11","2021 Toyota Kluger Grande Auto 2WD","2018 Toyota Kluger GXL Auto 2WD","2015 Toyota Kluger Grande Auto AWD","2022 Toyota Kluger Grande Auto 2WD","2017 Toyota Kluger GX Auto AWD","2017 Toyota Kluger GX Auto AWD","2016 Toyota Kluger Grande Auto AWD","2019 Toyota Kluger GXL Auto AWD","2021 Toyota Kluger GX Auto eFour","2021 Toyota Kluger GX Auto eFour","2019 Toyota Kluger GXL Auto AWD"],"odometer":["290,000 km","157,713 km","280,100 km","111,318 km","78,495 km","51,261 km","76,086 km","75,757 km","144,111 km","184,300 km","13,967 km","73,715 km","127,000 km","30,834 km","49,300 km","196,245 km","126,893 km","17,642 km","171,407 km","10,888 km","62,950 km","115,531 km","18,180 km","102,000 km","102,572 km","106,000 km","38,514 km","39,845 km","60,303 km","20,456 
km"],"body":["SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV"],"transmission":["Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic"],"engine":["6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","2.5i/184kW Hybrid","6cyl 3.5L Petrol"]},"columns":[{"id":"price","name":"price","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' 
&amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 2) 
{\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 25) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 11) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["$15,000*","$33,990*","$13,800*","$33,990*","$42,990","$73,990","$38,700*","$48,990","$35,990","$20,500*","$71,990","$41,990","$20,800*","$69,990","$58,999","$27,888","$34,990*","$61,990","$16,990","$61,880","$46,990","$35,977*","$69,990*","$35,900*","$34,990*","$40,000*","$48,990","$59,888*","$57,887*","$48,000*"],"html":true,"align":"right","headerStyle":{"font-weight":"normal"}},{"id":"title","name":"title","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' 
&amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 22) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; 
rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'body' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["2011 Toyota Kluger KX-R Auto 2WD MY11","2014 Toyota Kluger Grande Auto 2WD","2008 Toyota Kluger Grande Auto 2WD","2014 Toyota Kluger Grande Auto 2WD","2019 Toyota Kluger GX Auto 2WD","2021 Toyota Kluger Grande Auto eFour","2017 Toyota Kluger GXL Auto AWD","2019 Toyota Kluger GXL Auto 2WD","2014 Toyota Kluger Grande Auto 2WD","2013 Toyota Kluger Grande Auto AWD","2022 Toyota Kluger GXL Auto eFour","2019 Toyota Kluger GX Auto AWD","2010 Toyota Kluger KX-S Auto AWD","2022 Toyota Kluger Grande Auto 2WD","2021 Toyota Kluger GX Auto eFour","2017 Toyota Kluger GX Auto AWD","2015 Toyota Kluger Grande Auto AWD","2023 Toyota Kluger GX Auto eFour","2011 Toyota 
Kluger KX-R Auto 2WD MY11","2021 Toyota Kluger Grande Auto 2WD","2018 Toyota Kluger GXL Auto 2WD","2015 Toyota Kluger Grande Auto AWD","2022 Toyota Kluger Grande Auto 2WD","2017 Toyota Kluger GX Auto AWD","2017 Toyota Kluger GX Auto AWD","2016 Toyota Kluger Grande Auto AWD","2019 Toyota Kluger GXL Auto AWD","2021 Toyota Kluger GX Auto eFour","2021 Toyota Kluger GX Auto eFour","2019 Toyota Kluger GXL Auto AWD"],"html":true,"width":300,"align":"left","headerStyle":{"font-weight":"normal"}},{"id":"odometer","name":"odometer","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; 
rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 
3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; 
rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) 
{\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["290,000 km","157,713 km","280,100 km","111,318 km","78,495 km","51,261 km","76,086 km","75,757 km","144,111 km","184,300 km","13,967 km","73,715 km","127,000 km","30,834 km","49,300 km","196,245 km","126,893 km","17,642 km","171,407 km","10,888 km","62,950 km","115,531 km","18,180 km","102,000 km","102,572 km","106,000 km","38,514 km","39,845 km","60,303 km","20,456 km"],"html":true,"align":"left","headerStyle":{"font-weight":"normal"}},{"id":"body","name":"body","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 
6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 29) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' 
&amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 
15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; 
rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV","SUV"],"html":true,"align":"left","headerStyle":{"font-weight":"normal"}},{"id":"transmission","name":"transmission","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 
11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 4) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'title' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 13) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic"],"html":true,"align":"left","headerStyle":{"font-weight":"normal"}},{"id":"engine","name":"engine","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'title' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'title' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 9) {\n return { fontSize: 
'10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'body' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id 
=== 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return 
{ fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; 
rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","2.5i/184kW Hybrid","6cyl 3.5L 
Petrol"],"html":true,"align":"left","headerStyle":{"font-weight":"normal"}}],"resizable":true,"defaultPageSize":5,"showPageSizeOptions":false,"pageSizeOptions":[10,25,50,100],"paginationType":"numbers","showPagination":true,"showPageInfo":true,"minRows":1,"highlight":true,"compact":true,"nowrap":true,"showSortable":true,"height":"auto","theme":{"color":"#333333","backgroundColor":"#FFFFFF","stripedColor":"rgba(128,128,128,0.05)","style":{"fontFamily":"system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif"},"headerStyle":{"borderTopStyle":"solid","borderTopWidth":"2px","borderTopColor":"#D3D3D3","borderBottomStyle":"solid","borderBottomWidth":"2px","borderBottomColor":"#D3D3D3"}},"elementId":"qnxgxzutkf","dataKey":"7f39f5ec34509d2f1f597a368023aed5"},"children":[]},"class":"reactR_markup"},"evals":["tag.attribs.columns.0.style","tag.attribs.columns.1.style","tag.attribs.columns.2.style","tag.attribs.columns.3.style","tag.attribs.columns.4.style","tag.attribs.columns.5.style"],"jsHooks":[]}&lt;/script>
&lt;/div>
&lt;p>Now some housekeeping: the price and odometer are stored as strings with non-numeric characters such as the dollar sign, so we need to convert them to numbers. We also create a new &lt;em>megametre&lt;/em> variable (i.e. thousands of kilometres), which will be the variable we use in our model. The year, model, and drivetrain are extracted from the title of the advert using regular expressions.&lt;/p>
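&lt;p>To make those steps concrete, here is a minimal sketch on a single made-up advert. The title, price and odometer strings are assumptions for illustration rather than rows from the scraped data, and the &lt;code>group&lt;/code> argument assumes stringr 1.5.0 or later. The transformation applied to the full data set follows.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">library(readr)    # parse_number()
library(stringr)  # str_extract()

# Hypothetical advert fields (made up for illustration)
title    &amp;lt;- &amp;#34;2017 Toyota Kluger GX Auto AWD&amp;#34;
price    &amp;lt;- &amp;#34;$30,998&amp;#34;
odometer &amp;lt;- &amp;#34;119,075 km&amp;#34;

# parse_number() drops the surrounding non-numeric characters and the grouping comma
price_num   &amp;lt;- parse_number(price)
odometer_Mm &amp;lt;- parse_number(odometer) / 1000  # kilometres to megametres

# Leading four digits of the title: the year
year &amp;lt;- as.integer(str_extract(title, &amp;#34;^\\d{4}&amp;#34;))
# Last word of the title: the drivetrain
drivetrain &amp;lt;- str_extract(title, &amp;#34;\\w+$&amp;#34;)
# First capture group after the make and name: the model
model &amp;lt;- str_extract(title, &amp;#34;Toyota Kluger ([-\\w]+)&amp;#34;, group = 1)
&lt;/code>&lt;/pre>&lt;/div>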
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">kluger_data &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
kluger_data &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
odometer &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">parse_number&lt;/span>(odometer),
odometer_Mm &lt;span style="color:#f92672">=&lt;/span> odometer &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#ae81ff">1000&lt;/span>,
price &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">parse_number&lt;/span>(price),
year &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">as.integer&lt;/span>( &lt;span style="color:#a6e22e">str_extract&lt;/span>(title, &lt;span style="color:#e6db74">&amp;#34;^(\\d{4})&amp;#34;&lt;/span>, group &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) ),
drivetrain &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">str_extract&lt;/span>(title, &lt;span style="color:#e6db74">&amp;#34;\\w+$&amp;#34;&lt;/span>),
model &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">str_extract&lt;/span>(title, &lt;span style="color:#e6db74">&amp;#34;Toyota Kluger ([-\\w]+)&amp;#34;&lt;/span>, group &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>)
)
&lt;/code>&lt;/pre>&lt;/div>&lt;div id="unrejpwnqq" class=".gt_table" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
&lt;style>#unrejpwnqq table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
#unrejpwnqq thead, #unrejpwnqq tbody, #unrejpwnqq tfoot, #unrejpwnqq tr, #unrejpwnqq td, #unrejpwnqq th {
border-style: none;
}
#unrejpwnqq p {
margin: 0;
padding: 0;
}
#unrejpwnqq .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
#unrejpwnqq .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
#unrejpwnqq .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
#unrejpwnqq .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
#unrejpwnqq .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#unrejpwnqq .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#unrejpwnqq .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#unrejpwnqq .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
#unrejpwnqq .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
#unrejpwnqq .gt_column_spanner_outer:first-child {
padding-left: 0;
}
#unrejpwnqq .gt_column_spanner_outer:last-child {
padding-right: 0;
}
#unrejpwnqq .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
#unrejpwnqq .gt_spanner_row {
border-bottom-style: hidden;
}
#unrejpwnqq .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
#unrejpwnqq .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
#unrejpwnqq .gt_from_md > :first-child {
margin-top: 0;
}
#unrejpwnqq .gt_from_md > :last-child {
margin-bottom: 0;
}
#unrejpwnqq .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
#unrejpwnqq .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
#unrejpwnqq .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
#unrejpwnqq .gt_row_group_first td {
border-top-width: 2px;
}
#unrejpwnqq .gt_row_group_first th {
border-top-width: 2px;
}
#unrejpwnqq .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#unrejpwnqq .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
#unrejpwnqq .gt_first_summary_row.thick {
border-top-width: 2px;
}
#unrejpwnqq .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#unrejpwnqq .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#unrejpwnqq .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
#unrejpwnqq .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
#unrejpwnqq .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
#unrejpwnqq .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#unrejpwnqq .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#unrejpwnqq .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
#unrejpwnqq .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#unrejpwnqq .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
#unrejpwnqq .gt_left {
text-align: left;
}
#unrejpwnqq .gt_center {
text-align: center;
}
#unrejpwnqq .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
#unrejpwnqq .gt_font_normal {
font-weight: normal;
}
#unrejpwnqq .gt_font_bold {
font-weight: bold;
}
#unrejpwnqq .gt_font_italic {
font-style: italic;
}
#unrejpwnqq .gt_super {
font-size: 65%;
}
#unrejpwnqq .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
#unrejpwnqq .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
#unrejpwnqq .gt_indent_1 {
text-indent: 5px;
}
#unrejpwnqq .gt_indent_2 {
text-indent: 10px;
}
#unrejpwnqq .gt_indent_3 {
text-indent: 15px;
}
#unrejpwnqq .gt_indent_4 {
text-indent: 20px;
}
#unrejpwnqq .gt_indent_5 {
text-indent: 25px;
}
&lt;/style>
&lt;div style="font-family:system-ui, &amp;#39;Segoe UI&amp;#39;, Roboto, Helvetica, Arial, sans-serif;border-top-style:solid;border-top-width:2px;border-top-color:#D3D3D3;">
&lt;div class="gt_heading gt_title gt_font_normal" style="text-size:bigger;">Kluger Market Data&lt;/div>
&lt;div class="gt_heading gt_subtitle ">&lt;/div>
&lt;/div>
&lt;div id="unrejpwnqq" class="reactable html-widget " style="width:auto;height:auto;">&lt;/div>
&lt;script type="application/json" data-for="unrejpwnqq">{"x":{"tag":{"name":"Reactable","attribs":{"data":{"year":[2017,2021,2021,2015,2009,2017,2015,2013,2016,2017,2021,2018,2017,2021,2019,2015,2010,2021,2022,2018,2021,2016,2016,2022,2022,2018,2012,2012,2016,2017],"model":["GX","Grande","Grande","Grande","KX-S","Grande","GX","KX-R","GXL","GXL","GX","GX","GX","Grande","Black","GXL","KX-S","Grande","Grande","Grande","Grande","GX","GXL","GXL","GXL","Grande","Grande","Grande","Grande","Grande"],"price":[30998,66994,72850,30000,13989,42500,24900,16988,33950,32500,44888,37990,34800,76990,47950,31888,19500,64888,75499,45000,78888,33990,34990,63989,69990,40996,27888,20990,45990,43550],"odometer":[119075,23000,19250,158900,283996,79610,114150,197927,112579,176500,52531,170427,153800,25976,51324,91187,159975,29165,19500,119000,11400,87534,94595,12368,5544,111850,164454,219311,35645,86167],"transmission":["Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic"],"engine":["6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L 
Petrol"],"drivetrain":["AWD","AWD","eFour","AWD","2WD","AWD","2WD","2WD","2WD","2WD","2WD","AWD","AWD","eFour","2WD","2WD","MY11","2WD","eFour","AWD","eFour","2WD","2WD","AWD","AWD","2WD","MY12","MY12","2WD","AWD"]},"columns":[{"id":"year","name":"year","type":"numeric","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'year' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 
20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 13) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex 
=== 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: 
'10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: 
'10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 21) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["2017","2021","2021","2015","2009","2017","2015","2013","2016","2017","2021","2018","2017","2021","2019","2015","2010","2021","2022","2018","2021","2016","2016","2022","2022","2018","2012","2012","2016","2017"],"html":true,"align":"right","headerStyle":{"font-weight":"normal"}},{"id":"model","name":"model","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'year' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'year' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 
3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 26) {\n return { fontSize: 
'10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id 
=== 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 
19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'drivetrain' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 30) {\n return { fontSize: '10' 
}\n}\n\n}","cell":["GX","Grande","Grande","Grande","KX-S","Grande","GX","KX-R","GXL","GXL","GX","GX","GX","Grande","Black","GXL","KX-S","Grande","Grande","Grande","Grande","GX","GXL","GXL","GXL","Grande","Grande","Grande","Grande","Grande"],"html":true,"align":"left","headerStyle":{"font-weight":"normal"}},{"id":"price","name":"price","type":"numeric","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'year' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 19) 
{\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'model' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' 
&amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) 
{\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id 
=== 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'drivetrain' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["30,998","66,994","72,850","30,000","13,989","42,500","24,900","16,988","33,950","32,500","44,888","37,990","34,800","76,990","47,950","31,888","19,500","64,888","75,499","45,000","78,888","33,990","34,990","63,989","69,990","40,996","27,888","20,990","45,990","43,550"],"html":true,"align":"right","headerStyle":{"font-weight":"normal"}},{"id":"odometer","name":"odometer","type":"numeric","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'year' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' 
&amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 1) {\n return { fontSize: 
'10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id 
=== 'model' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 
18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex 
=== 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: 
'10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; 
rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 10) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 30) {\n return { fontSize: '10' 
}\n}\n\n}","cell":["119,075","23,000","19,250","158,900","283,996","79,610","114,150","197,927","112,579","176,500","52,531","170,427","153,800","25,976","51,324","91,187","159,975","29,165","19,500","119,000","11,400","87,534","94,595","12,368","5,544","111,850","164,454","219,311","35,645","86,167"],"html":true,"align":"right","headerStyle":{"font-weight":"normal"}},{"id":"transmission","name":"transmission","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'year' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 18) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'model' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' 
&amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'drivetrain' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic","Automatic"],"html":true,"align":"left","headerStyle":{"font-weight":"normal"}},{"id":"engine","name":"engine","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'year' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id 
=== 'year' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 29) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif 
(colInfo.id === 'model' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; 
rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; 
rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: 
'10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; 
rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 8) 
{\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 
30) {\n return { fontSize: '10' }\n}\n\n}","cell":["6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","2.5i/184kW Hybrid","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol","6cyl 3.5L Petrol"],"html":true,"align":"left","headerStyle":{"font-weight":"normal"}},{"id":"drivetrain","name":"drivetrain","type":"character","style":"function(rowInfo, colInfo) {\nconst rowIndex = rowInfo.index + 1\nif (colInfo.id === 'year' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 14) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'year' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; 
rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'model' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 1) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 24) {\n return { fontSize: '10' 
}\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'price' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 17) {\n 
return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'odometer' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 9) {\n return { 
fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'transmission' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'engine' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 
'engine' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'engine' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 1) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 2) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 3) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 4) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 5) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 6) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 7) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 8) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 9) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 10) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 11) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 12) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 13) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 14) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 15) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 
16) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 17) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 18) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 19) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 20) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 21) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 22) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 23) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 24) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 25) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 26) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 27) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 28) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 29) {\n return { fontSize: '10' }\n}\n\nif (colInfo.id === 'drivetrain' &amp; rowIndex === 30) {\n return { fontSize: '10' }\n}\n\n}","cell":["AWD","AWD","eFour","AWD","2WD","AWD","2WD","2WD","2WD","2WD","2WD","AWD","AWD","eFour","2WD","2WD","MY11","2WD","eFour","AWD","eFour","2WD","2WD","AWD","AWD","2WD","MY12","MY12","2WD","AWD"],"html":true,"align":"left","headerStyle":{"font-weight":"normal"}}],"resizable":true,"defaultPageSize":5,"showPageSizeOptions":false,"pageSizeOptions":[10,25,50,100],"paginationType":"numbers","showPagination":true,"showPageInfo":true,"minRows":1,"highlight":true,"compact":true,"nowrap":true,"showSortable":true,"height":"auto","theme":{"color":"#333333","backgroundColor":"#FFFFFF","stripedColor":"rgba(128,128,128,0.05)","style":{"fontFamily":"system-ui, 'Segoe UI', Roboto, 
Helvetica, Arial, sans-serif"},"headerStyle":{"borderTopStyle":"solid","borderTopWidth":"2px","borderTopColor":"#D3D3D3","borderBottomStyle":"solid","borderBottomWidth":"2px","borderBottomColor":"#D3D3D3"}},"elementId":"unrejpwnqq","dataKey":"ae9c7fc6fcfc11b7f125a232c2ff1f40"},"children":[]},"class":"reactR_markup"},"evals":["tag.attribs.columns.0.style","tag.attribs.columns.1.style","tag.attribs.columns.2.style","tag.attribs.columns.3.style","tag.attribs.columns.4.style","tag.attribs.columns.5.style","tag.attribs.columns.6.style"],"jsHooks":[]}&lt;/script>
&lt;/div>
&lt;h1 id="taking-a-quick-look">Taking a Quick Look&lt;/h1>
&lt;p>Let’s visualise key features of the data. The one I think will be most relevant is how the price is affected by the odometer reading.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/figure-html/unnamed-chunk-13-1.png" width="672" />
Nothing too surprising here: the more kilometres, the lower the sale price. But what we notice is the shape: it looks suspiciously like there’s some sort of negative exponential relationship between the odometer and price. What if, rather than looking at odometer versus price, we look at odometer versus log(price):&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/figure-html/unnamed-chunk-14-1.png" width="672" />
There’s some good news, and some bad news here. The good news is that the log transform has given us a linear relationship between the two variables, simplifying our modelling.&lt;/p>
&lt;p>The bad news is that the data looks to be heteroskedastic, meaning its variance changes across the odometer ranges. This won’t affect our linear model’s parameters, but will affect our ability to use the model to predict the price. We’ll persevere nonetheless.&lt;/p>
&lt;p>There’s a nice interpretation of the linear model when using a log transformation. When you fit a line \(y = \alpha + \beta x\), the slope \(\beta\) is “the change in y given a change of one unit of x”. But when you fit a line to \(\log(y) = \alpha + \beta x\), a one unit change of x multiplies y by \(e^\beta\), so for small \(\beta\), \(e^\beta - 1\) (roughly \(\beta\) itself) is ‘the &lt;strong>percentage&lt;/strong> change in y for a one unit change of x’, expressed as a fraction.&lt;/p>
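&lt;p>A short derivation makes this concrete; it assumes nothing beyond the log-linear model itself, and the symbols \(y_0\) and \(y_1\) are introduced here only for illustration. Let \(y_0\) be the value at \(x\) and \(y_1\) the value at \(x + 1\). Then \(\log(y_1) - \log(y_0) = \alpha + \beta (x + 1) - (\alpha + \beta x) = \beta\), so \(y_1 / y_0 = e^\beta\). For small \(\beta\) we have \(e^\beta \approx 1 + \beta\), which is why the slope can be read as a fractional change. A slope of \(-0.01\), for example, gives \(e^{-0.01} \approx 0.990\), a drop of about 1% per unit of x.&lt;/p>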
&lt;p>Where’s the other variance coming from? Here’s the same view, but split out by model:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/figure-html/unnamed-chunk-15-1.png" width="672" />&lt;/p>
&lt;p>It barely needs to be stated, but the Kluger model has an impact on the sale price.&lt;/p>
&lt;h1 id="modelling">Modelling&lt;/h1>
&lt;p>Let’s start the modelling by thinking about the generative process for the price. Our observed variables odometer, year, model, and drivetrain are likely going to have an effect on price. There are some unobserved variables, such as the condition of the car or its popularity, that would also have an effect. There are some &lt;a href="https://en.wikipedia.org/wiki/Confounding">confounds&lt;/a> that may need to be dealt with as well: year directly affects price, but also goes through the odometer (older cars are more likely to have more kilometres). Model affects price, but also goes through the drivetrain (certain models have certain drivetrains).&lt;/p>
&lt;p>The best way to visualise this is using a directed acyclic graph (DAG):&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/figure-html/unnamed-chunk-16-1.png" width="672" />&lt;/p>
&lt;p>While I do have these variables available to me, I’d like to start with the most simple model I can: log of the price predicted by the odometer (in megametres). In doing this I’m leaving a lot of variability on the table, so the model’s ability to predict is likely going to be hampered. But better to start simple and build up.&lt;/p>
&lt;p>At this point I could rip out a standard linear regression, but where’s the sport in that? Instead, I’ll use this as an opportunity to model this in a Bayesian manner.&lt;/p>
&lt;h1 id="bayesian-modeling">Bayesian Modeling&lt;/h1>
&lt;p>I’m going to be using &lt;a href="https://mc-stan.org/">Stan&lt;/a> to perform the modelling, executing it from R using the &lt;a href="https://mc-stan.org/cmdstanr/">cmdstanr&lt;/a> package. Here’s the Stan program:&lt;/p>
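&lt;pre>&lt;code>data {
// Number of observations
int&amp;lt;lower=0&amp;gt; n;
// Odometer readings in megametres (1 Mm = 1,000 km)
vector[n] odometer_Mm;
// Sale prices
vector[n] price;
}
parameters {
// Intercept
real a;
// Slope
real b;
// Standard deviation of log(price) around the line
real&amp;lt;lower=0&amp;gt; sigma;
}
model {
// Likelihood: log(price) is normal around a straight line in the odometer
log(price) ~ normal(a + b * odometer_Mm, sigma);
}
generated quantities {
// Posterior predictive draws of log(price) for each observed odometer value
array[n] real y_s = normal_rng(a + b * odometer_Mm, sigma);
// Posterior predictive price for a car with 60 Mm (60,000 km) on the odometer
real price_pred = exp( normal_rng(a + b * 60, sigma) );
}
&lt;/code>&lt;/pre>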
&lt;p>It should be relatively easy to read: the data is our observed odometer (in megametres) and price, the parameters we’re looking to find are &lt;em>a&lt;/em> (for alpha, the intercept), &lt;em>b&lt;/em> (for beta, the slope), and &lt;em>sigma&lt;/em> (the standard deviation around the line). Our likelihood is a linear regression with a mean based on a, b and the odometer, and a standard deviation of sigma.&lt;/p>
&lt;p>There’s a conspicuous absence of priors for our parameters. If the priors are not specified, they’re improper priors, and will be considered flat \(Uniform(- \infty, +\infty)\), with the exception of sigma, for which we have defined a lower bound of 0 in the &lt;em>parameters&lt;/em> section. Flat priors aren’t ideal, as there is some prior information I think we can build into the model (\(\beta\) is unlikely to be positive). But I’m going to do a bit of hand-waving and not put a stake in the ground at this stage.&lt;/p>
&lt;p>We’ll talk about the generated quantities a little later; these are used for validating the predictive capability of the model.&lt;/p>
&lt;p>Let’s run the model with the data to get an estimate of our posterior distributions:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">kluger_fit &lt;span style="color:#f92672">&amp;lt;-&lt;/span> kluger_model&lt;span style="color:#f92672">$&lt;/span>&lt;span style="color:#a6e22e">sample&lt;/span>(
data &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">compose_data&lt;/span>(kluger_data),
seed &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">123&lt;/span>,
chains &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">4&lt;/span>,
parallel_chains &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">4&lt;/span>,
refresh &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">500&lt;/span>,
)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>Running MCMC with 4 parallel chains...
Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 3 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 4 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 3 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 4 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 3 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 3 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 4 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 4 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 3 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 4 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
Chain 2 finished in 3.9 seconds.
Chain 3 Iteration: 2000 / 2000 [100%] (Sampling)
Chain 3 finished in 4.3 seconds.
Chain 4 Iteration: 2000 / 2000 [100%] (Sampling)
Chain 4 finished in 4.9 seconds.
Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
Chain 1 finished in 5.3 seconds.
All 4 chains finished successfully.
Mean chain execution time: 4.6 seconds.
Total execution time: 5.4 seconds.
&lt;/code>&lt;/pre>
&lt;h1 id="assessing-the-model">Assessing the Model&lt;/h1>
&lt;p>What’s Stan done for us? It’s used Hamiltonian Monte Carlo to take samples from an estimate of our posterior distribution for each of our parameters &lt;em>a&lt;/em>, &lt;em>b&lt;/em>, and &lt;em>sigma&lt;/em>. Now we take a (cursory) look at whether the sampling has converged, or whether some of the sampling has wandered off into a strange place.&lt;/p>
&lt;p>The &lt;em>trace plot&lt;/em> is the first diagnostic tool to pull out. We want these to look like “fuzzy caterpillars”, showing that each chain is exploring the distribution in a similar way, and isn’t wandering off on its own for too long.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/figure-html/unnamed-chunk-20-1.png" width="672" />
These look pretty good. That’s as much diagnostics as we’ll do for the sake of this article; for more serious tasks you’d likely look at additional convergence tests such as effective sample size and R-hat.&lt;/p>
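&lt;p>To give a feel for what R-hat measures, here is a minimal sketch in base R. It implements the classic (non-split) Gelman-Rubin statistic, which is simpler than the rank-normalised split R-hat that Stan’s tooling reports, and it runs on simulated stand-in draws rather than the real fit. The helper name &lt;code>rhat_basic&lt;/code> and the simulated values are assumptions made purely for illustration.&lt;/p>
&lt;pre>&lt;code class="language-r" data-lang="r"># Classic (non-split) Gelman-Rubin R-hat for a single parameter.
# `draws` is an iterations x chains matrix; this helper is illustrative only.
rhat_basic &amp;lt;- function(draws) {
  n &amp;lt;- nrow(draws)                     # iterations per chain
  chain_means &amp;lt;- colMeans(draws)
  B &amp;lt;- n * var(chain_means)            # between-chain variance
  W &amp;lt;- mean(apply(draws, 2, var))      # mean within-chain variance
  var_plus &amp;lt;- (n - 1) / n * W + B / n  # pooled estimate of the posterior variance
  sqrt(var_plus / W)
}

set.seed(123)
# Stand-in draws: four chains of 1,000 samples from the same distribution
mixed &amp;lt;- matrix(rnorm(4000, mean = 11.1, sd = 0.01), ncol = 4)
# The same draws, but with one chain shifted away from the others
stuck &amp;lt;- mixed
stuck[, 1] &amp;lt;- stuck[, 1] + 0.05

rhat_basic(mixed)  # close to 1: the chains agree
rhat_basic(stuck)  # well above 1: one chain has wandered off
&lt;/code>&lt;/pre>
&lt;p>In practice there’s no need to hand-roll this: the &lt;code>summary()&lt;/code> method of a cmdstanr fit object reports &lt;code>rhat&lt;/code>, &lt;code>ess_bulk&lt;/code> and &lt;code>ess_tail&lt;/code> columns for each parameter.&lt;/p>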
&lt;p>Taking the draws/samples from the posterior and plotting as a histogram we see the distribution of values that each of our four chains has come up with:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/figure-html/unnamed-chunk-21-1.png" width="672" />
The first thing to notice it that, on the whole, each one looks Gaussian. Secondly, each of the chains has a similar shape, meaning they’ve all explored similar parts of the posterior. The intercept &lt;em>a&lt;/em> has a mean approximately 11.13, and the slope &lt;em>b&lt;/em> has a mean of approximately -0.0063. These are on a log scale, so exponentiating each of these values tells us that the average price with zero on tne odometer is ~$66k, and for every 1,000km the average price is ~99.3% what it was before the 1,000km were driven.&lt;/p>
&lt;p>We’re not dealing with point estimates as we would with a linear regression; we’ve got an estimate of the posterior distribution. As such, there’s no single line to plot. Sure we can take the mean, but we could also use the median or mode. To visualise the regression we take each of our draws and plot it as a line, effectively giving us the credible intervals for the &lt;em>a&lt;/em> and &lt;em>b&lt;/em> parameters.&lt;/p>
&lt;p>&lt;img src="index_files/figure-html/unnamed-chunk-22-1.gif" alt="">&lt;!-- -->&lt;/p>
&lt;p>The credible intervals are not very wide (which we saw in the histograms above). Looking at the 89% interval of the slope parameter we see it’s between -0.0061194 and -0.0064036. Exponentiating this and turning into percentages, we find that the plausible range for the decrease in price per 1,000km driven is between 0.610% and 0.638%.&lt;/p>
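&lt;p>To make that conversion concrete, here’s the arithmetic in R using the two interval endpoints quoted above. Because the model is fitted to log(price), exponentiating the slope gives a multiplicative change per 1,000km. The variable names are purely illustrative.&lt;/p>
&lt;pre>&lt;code class="language-r" data-lang="r"># 89% interval endpoints for the slope b (change in log-price per 1,000km)
b_interval &amp;lt;- c(-0.0061194, -0.0064036)

# exp(b) is the price multiplier per 1,000km, so 1 - exp(b) is the fractional drop
pct_decrease &amp;lt;- 100 * (1 - exp(b_interval))
round(pct_decrease, 3)
&lt;/code>&lt;/pre>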
&lt;p>The bad news is news we really already knew: the variance &lt;em>sigma&lt;/em> around our line is large and it looks to be non-constant across the odometer values. A check to see how well this performed is posterior prediction, in which we bring &lt;em>sigma&lt;/em> into the equation.&lt;/p>
&lt;p>Recall the following line in our Stan program:&lt;/p>
&lt;pre>&lt;code>array[n] real y_s = normal_rng(a + b * odometer_Mm, sigma);
&lt;/code>&lt;/pre>
&lt;p>What we’re doing here is using the parameter posterior distributions and our model to generate random draws, creating a &lt;em>posterior predictive distribution&lt;/em> for each odometer value. The idea is that, if our model is ‘good’, we should get values back that look like the data we’re modelling. In the plot below, I’ve filtered out values below 5.5% and above 94.5% to give an 89% prediction interval.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/figure-html/unnamed-chunk-23-1.png" width="672" />
What we find is that the predictive value of our simple linear model is not great. At odometer values close to zero it’s too conservative, with all the prices falling well inside our predicted bands in light blue. At the other end of the scale the model is too confident, with many of the real observations falling outside of our predictive bands.&lt;/p>
&lt;p>Another way to look at this is to look at the densities for both our model and the real values as the odometer values change. You can think of this as sitting &lt;em>on&lt;/em> the above graph’s x-y plane, moving backwards along the odometer and looking at 10,000km slices of odometer values:&lt;/p>
&lt;p>&lt;img src="index_files/figure-html/echo-1.gif" alt="">&lt;!-- -->
This gives us a nice comparison of the model’s probability density compared to the real data. At the start all of the data fits inside the 89% PI, and as we move along odometer values, the log(price) gets wider and falls outside of the model’s predicted area.&lt;/p>
&lt;p>Despite these obvious flaws, let’s see how the model performs answering our original question: “what is the market sell price for a Toyota Kluger with 60,000kms on the odometer?” I’m using this line from the &lt;em>generated quantities&lt;/em> section of the Stan model:&lt;/p>
&lt;pre>&lt;code>real price_pred = exp( normal_rng(a + b * 60, sigma) );
&lt;/code>&lt;/pre>
&lt;p>This uses parameters drawn from the posterior distribution, but fixes the odometer value at 60 megametres, and exponentiates to give us the price rather than log(price).&lt;/p>
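&lt;p>The same calculation can be mimicked in R to see what that line is doing. This is only a sketch: the draws below are simulated stand-ins centred on the posterior means quoted earlier, and the spreads of &lt;em>a&lt;/em> and &lt;em>b&lt;/em> and the value of &lt;em>sigma&lt;/em> are placeholder assumptions, not output from the fitted model.&lt;/p>
&lt;pre>&lt;code class="language-r" data-lang="r">set.seed(42)
n_draws &amp;lt;- 4000

# Simulated stand-ins for posterior draws (assumed values, not the real fit)
a     &amp;lt;- rnorm(n_draws, mean = 11.13,   sd = 0.01)
b     &amp;lt;- rnorm(n_draws, mean = -0.0063, sd = 0.0001)
sigma &amp;lt;- rep(0.2, n_draws)  # placeholder; the real posterior for sigma is not shown here

# One predictive draw per posterior draw at 60 megametres, back on the dollar scale
price_pred &amp;lt;- exp(rnorm(n_draws, mean = a + b * 60, sd = sigma))

# 89% prediction interval from the 5.5% and 94.5% quantiles
quantile(price_pred, probs = c(0.055, 0.945))
&lt;/code>&lt;/pre>
&lt;p>In this sketch most of the interval’s width comes from &lt;em>sigma&lt;/em> rather than from uncertainty in &lt;em>a&lt;/em> and &lt;em>b&lt;/em>, which is how narrow parameter intervals can still produce a wide spread of predicted prices.&lt;/p>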
&lt;p>Here’s the resulting distribution of prices with an 89% prediction interval (5.5% and 94.5% quantiles):&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2023-09-28-honest-insurance-company/index_files/figure-html/unnamed-chunk-24-1.png" width="672" />
That’s a large spread, with an 89% interval between $33,878.97 and $65,575.44. That’s too large to be of any use to us in validating the market value the insurance company gave me for my car.&lt;/p>
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>We’ve been on quite a journey in this article: from gathering and visualising data, to trying and validating a new Bayesian modelling approach, to finally generating prediction intervals for the Kluger prices. All this, and at the end we’ve got nothing to show for it?&lt;/p>
&lt;p>Well, not quite. Yes we’ve got a very simple model that doesn’t perform very well, but it is a foundation. From this, we can start to bring in other predictors that influence price. The next step might be to look at a hierarchical model that brings in the model of the Kluger. Maybe we can also find some data on the condition of the car? Sounds like a good idea for another post!&lt;/p></description></item><item><title>Telling a Spatial Story</title><link>https://clt.blog.foletta.net/post/spatial_story/</link><pubDate>Thu, 09 Mar 2023 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/spatial_story/</guid><description>&lt;p>My friend Jen is writing a thesis and recently reached out to me to see if I could help. She had an upcoming presentation and wanted to add some visualisation to it to better tell the story. I jumped at the chance, as it was an opportunity to familiarise myself with an area I hadn&amp;rsquo;t previously explored: geospatial data.&lt;/p>
&lt;p>In this short article I&amp;rsquo;ll take you through the creation of the visualisation. What I hope it shows is that the transition from raw data to something that tells a story can be done elegantly and with a relatively small amount of code.&lt;/p>
&lt;h1 id="whats-the-story">What&amp;rsquo;s the Story?&lt;/h1>
&lt;p>Jen&amp;rsquo;s thesis is on post-war migration into the inner-northern suburbs of Melbourne, Australia. Using census data from the 50s, 60s, and 70s, she wanted to communicate this migration, specifically the increase in concentrations on a per suburb basis and how migration changed geographically over these decades.&lt;/p>
&lt;p>Jen had provided me with the data, and I thought the best way to communicate this was to present the data on the geography of Melbourne, animating the changes between each census year.&lt;/p>
&lt;h1 id="step-1-the-data">Step 1: The Data&lt;/h1>
&lt;p>Jen sent me the migration data in an Excel spreadsheet, and I manually changed it into a tidy format. Were I doing this on an ongoing basis I would have scripted this tidying for reproducibility, but due to time constraints the manual method was quicker and easier.&lt;/p>
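&lt;p>For what it&amp;rsquo;s worth, a scripted version of that tidying might look something like the base R sketch below. It assumes a hypothetical wide layout with one row per suburb and one column per census year; the original spreadsheet may well have been arranged differently.&lt;/p>
&lt;pre>&lt;code class="language-r" data-lang="r"># Hypothetical wide layout: one row per suburb, one column per census year
wide &amp;lt;- data.frame(
  Suburb = c("Altona", "Brunswick"),
  `1961` = c(3973, 15746),
  `1966` = c(7186, 19013),
  check.names = FALSE
)

year_cols &amp;lt;- setdiff(names(wide), "Suburb")

# Stack the year columns into one row per suburb and year
tidy &amp;lt;- data.frame(
  Year   = rep(as.numeric(year_cols), each = nrow(wide)),
  Suburb = rep(wide$Suburb, times = length(year_cols)),
  Total  = unlist(wide[year_cols], use.names = FALSE)
)
tidy[order(tidy$Suburb, tidy$Year), ]
&lt;/code>&lt;/pre>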
&lt;p>Here&amp;rsquo;s the first few rows of the data showing the year of the census, the suburb, and the total number and percentage of the population born overseas.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">migrant_data &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">read.xlsx&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;data/migrant_population_growth.xlsx&amp;#39;&lt;/span>, sheet &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Tidied&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">as_tibble&lt;/span>()
migrant_data &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">arrange&lt;/span>(Suburb) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">head&lt;/span>(&lt;span style="color:#ae81ff">10&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 10 × 4
## Year Suburb Total Percentage
## &amp;lt;dbl&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
## 1 1961 Altona 3973 0.246
## 2 1966 Altona 7186 0.287
## 3 1971 Altona 8777 0.287
## 4 1971 Broadmeadows 20355 0.201
## 5 1954 Brunswick 6603 0.123
## 6 1961 Brunswick 15746 0.297
## 7 1966 Brunswick 19013 0.366
## 8 1971 Brunswick 20352 0.395
## 9 1954 Caulfield 12727 0.153
## 10 1971 Caulfield 16696 0.204
&lt;/code>&lt;/pre>&lt;p>The next step was to get the geospatial data for suburbs. Thankfully the Australian government has &lt;a href="https://data.gov.au/dataset/ds-dga-af33dd8c-0534-4e18-9245-fc64440f742e/distribution/dist-dga-4d6ec8bb-1039-4fef-aa58-6a14438f29b1/details?q=">shapefile data&lt;/a> available. Here&amp;rsquo;s a render of the full content of the geospatial data:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">vic_localities &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">read_sf&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;data/VIC_LOC_POLYGON_shp GDA2020/vic_localities.shp&amp;#39;&lt;/span>)
vic_localities &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_sf&lt;/span>(size &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.1&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/spatial_story/index_files/figure-html/unnamed-chunk-2-1.png" width="672" />&lt;/p>
&lt;p>These borders are the locality (suburb) boundaries for the entire state of Victoria, Australia. The borders of these localities aren&amp;rsquo;t going to be &lt;em>exactly&lt;/em> the same now as they were when the censuses were performed, but it&amp;rsquo;ll be good enough for our purposes. We&amp;rsquo;re only interested in the inner-Melbourne area, so we crop this to the relevant latitudes and longitudes:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">inner_melb_localities &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
vic_localities &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">st_crop&lt;/span>(xmin&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">144.7&lt;/span>, xmax&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">145.1&lt;/span>, ymin&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">-37.95&lt;/span>, ymax&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">-37.6&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/spatial_story/index_files/figure-html/unnamed-chunk-4-1.png" width="672" />&lt;/p>
&lt;p>With our two key pieces of data in place, we can start to put them together.&lt;/p>
&lt;h1 id="step-2---data-wrangling">Step 2 - Data Wrangling&lt;/h1>
&lt;p>The next step is to merge our migration data with our geospatial data using the suburb as our key; however, it&amp;rsquo;s a little more complex than a simple join. As we want to ensure that for every year we have all of the geospatial information so as to render the full map, we need to do a bit of &lt;code>group_by()&lt;/code>ing and &lt;code>nest()&lt;/code>ing.&lt;/p>
&lt;p>The way I&amp;rsquo;ve tackled this is as such:&lt;/p>
&lt;ul>
&lt;li>Group by each year.&lt;/li>
&lt;li>Nest the data based on these groups.&lt;/li>
&lt;li>Perform a join on this nested data with the spatial data.&lt;/li>
&lt;li>This way we ensure that for each year, the spatial data for each suburb is present and the map is complete.&lt;/li>
&lt;li>Unnest the data.&lt;/li>
&lt;li>For suburbs that don&amp;rsquo;t have any data for a particular census, we assign them 0 in the &lt;strong>Percentage&lt;/strong> and &lt;strong>Total&lt;/strong> columns.
&lt;ul>
&lt;li>I talked to Jen about this, and she made a decision in the short term to replace these with 0.&lt;/li>
&lt;li>Were more rigor required for the visualisation, other options would need to be considered to better portray this missing data.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Convert to a geospatial object.&lt;/li>
&lt;/ul>
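&lt;p>The steps above can be sketched on a toy example. This uses base R&amp;rsquo;s &lt;code>merge()&lt;/code> in place of the tidyverse verbs, and a plain data frame of suburb names as a stand-in for the spatial data, so it only illustrates the shape of the join and not the geometry.&lt;/p>
&lt;pre>&lt;code class="language-r" data-lang="r"># Toy migration data: Broadmeadows has no 1954 record
migrants &amp;lt;- data.frame(
  Year   = c(1954, 1971, 1971),
  Suburb = c("Brunswick", "Brunswick", "Broadmeadows"),
  Total  = c(6603, 20352, 20355)
)

# Stand-in for the spatial data: one row per suburb on the map
suburbs &amp;lt;- data.frame(Suburb = c("Brunswick", "Broadmeadows"))

# Join each census year separately so every year ends up with every suburb
per_year &amp;lt;- lapply(split(migrants, migrants$Year), function(d) {
  joined &amp;lt;- merge(d[c("Suburb", "Total")], suburbs, by = "Suburb", all.y = TRUE)
  joined$Year &amp;lt;- d$Year[1]
  joined
})
complete &amp;lt;- do.call(rbind, per_year)

# Suburbs with no observation for a year come back as NA; replace with 0
complete$Total[is.na(complete$Total)] &amp;lt;- 0
complete
&lt;/code>&lt;/pre>
&lt;p>Joining the whole table in one go would only add Broadmeadows for the years in which it was actually recorded, leaving a hole in the 1954 map.&lt;/p>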
&lt;p>Below is the full pipeline, interspersed with comments to help understand what&amp;rsquo;s happening in each stage:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">migrant_data_geo &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
migrant_data &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(Year) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">nest&lt;/span>() &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#75715e"># On a per census year basis, join each year&amp;#39;s data with the spatial data&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
geo &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(data, &lt;span style="color:#f92672">~&lt;/span>{ &lt;span style="color:#a6e22e">right_join&lt;/span>(.x, inner_melb_localities, by &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;Suburb&amp;#39;&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;LOC_NAME&amp;#39;&lt;/span>)) })
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">unnest&lt;/span>(geo) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">arrange&lt;/span>(Suburb) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#75715e"># Replace the NAs due to missing data with zero&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
Percentage &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">replace_na&lt;/span>(Percentage, &lt;span style="color:#ae81ff">0&lt;/span>),
Total &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">replace_na&lt;/span>(Total, &lt;span style="color:#ae81ff">0&lt;/span>),
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(&lt;span style="color:#f92672">-&lt;/span>data) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ungroup&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#75715e"># Convert back to a &amp;#39;simple features&amp;#39; (geospatial) object&lt;/span>
&lt;span style="color:#a6e22e">st_as_sf&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;h1 id="step-3---rendering">Step 3 - Rendering&lt;/h1>
&lt;p>The final step is the easiest: the rendering. The map is rendered with the fill of each polygon representing the percentage of population born overseas. This is then animated, with the fill transitioning smoothly between each &lt;em>Year&lt;/em>:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">migrant_data_geo_animation &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
migrant_data_geo &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_sf&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(fill &lt;span style="color:#f92672">=&lt;/span> Percentage)) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;Melbourne - Percentage Residents Born Overseas&amp;#34;&lt;/span>,
subtitle &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;Census Year: {closest_state}&amp;#34;&lt;/span>
) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">theme&lt;/span>(
plot.title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">element_text&lt;/span>(size &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">10&lt;/span>),
plot.subtitle &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">element_text&lt;/span>(size &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">8&lt;/span>),
legend.title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">element_text&lt;/span>(size &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">8&lt;/span>),
legend.text&lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">element_text&lt;/span>(size &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">6&lt;/span>)
) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">scale_fill_distiller&lt;/span>(name &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;Percent&amp;#34;&lt;/span>, trans &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;reverse&amp;#39;&lt;/span>, labels &lt;span style="color:#f92672">=&lt;/span> percent) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">transition_states&lt;/span>(Year, transition_length &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span>, state_length &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">5&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>What we have in the end is what I handed over to Jen for her presentation: an animation showing the change in overseas-born population in inner-city Melbourne from 1954 to 1971.&lt;/p>
&lt;p>&lt;img src="index_files/figure-html/unnamed-chunk-7-1.gif" alt="">&lt;!-- -->&lt;/p>
&lt;h1 id="is-that-it">Is That It?&lt;/h1>
&lt;p>Is this visualisation perfect? Far from it. There&amp;rsquo;s probably two areas where it could be improved. The first (and least important) is the aesthetics of it; I think it could simply look better. Better fonts, better arrangement, different colours.&lt;/p>
&lt;p>But more importantly I think that there are choices to be made about how the information is presented. The suburbs with no information have been zeroed out, is that the right choice? Does the colour scale accurately convey the change, or does it need to have multiple colours in it? Do the suburbs need to be labelled? What exactly is the definition of &amp;ldquo;inner-north Melbourne&amp;rdquo;?&lt;/p>
&lt;p>All of those are decisions best made by someone with domain-specific knowledge, not necessarily by the person who&amp;rsquo;s generating the visualisation. Regardless, if we&amp;rsquo;re looking at this through an 80/20 lens, I continue to be amazed by the 80 that can be generated with only a few lines of R code.&lt;/p></description></item><item><title>A Brief Tour of Lebesgue Curves</title><link>https://clt.blog.foletta.net/post/lebesgue-curves/</link><pubDate>Sun, 30 Oct 2022 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/lebesgue-curves/</guid><description>&lt;p>Before we start, a note: this post is a sidebar for another article I&amp;rsquo;m currently writing. There aren&amp;rsquo;t any grand conclusions or deep insights, it&amp;rsquo;s more exploratory.&lt;/p>
&lt;p>Whilst writing an article on memory allocations, I needed a way to map a one-dimensional number (the memory location) on to two-dimensional space. By doing this I could visualise where in memory these allocations were occurring.&lt;/p>
&lt;p>My initial reaction was to reach for the space-filling Hilbert curve à la the &lt;a href="https://xkcd.com/195/">XKCD &amp;ldquo;Map of the Internet&amp;rdquo;&lt;/a>, but whilst researching I discovered the &lt;a href="https://en.wikipedia.org/wiki/Z-order_curve">Lebesgue Curve&lt;/a>, also known as the &lt;em>Z-order&lt;/em> or &lt;em>Morton&lt;/em> curve. At first glance it looked to have reasonable locality, and its inherent binary nature meant it appeared easier to implement.&lt;/p>
&lt;p>In this article I&amp;rsquo;ll implement the Lebesgue curve and explore some of its properties.&lt;/p>
&lt;h1 id="lebesgue-curve">Lebesgue Curve&lt;/h1>
&lt;p>The Lebesgue curve maps a one-dimensional integer into integers in two or more dimensions. It can also be used in reverse to map two or more integers back into a single integer. If we get a bit fancy with our notation, we can define the Lebesgue function \(l\) as:&lt;/p>
&lt;p>$$ l : \mathbb{N} \to \mathbb{N}^2 $$&lt;/p>
&lt;p>where \(\mathbb{N}\) is the set of natural numbers, including 0.&lt;/p>
&lt;p>The algorithm is relatively simple:&lt;/p>
&lt;ul>
&lt;li>Take an \(n\) bit integer \(Z\)&lt;/li>
&lt;li>Mask the even bits \([0, 2, \ldots, n - 2]\) into \(x\)&lt;/li>
&lt;li>Mask the odd bits \([1, 3, \ldots, n - 1]\) into \(y\)&lt;/li>
&lt;li>Collapse/shift the masked bits down so that they are &amp;ldquo;next&amp;rdquo; to each other
&lt;ul>
&lt;li>This results in an \(\frac{n}{2}\) bit integer&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
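&lt;p>To trace the steps above for a single value, here&amp;rsquo;s a minimal sketch in base R. The &lt;code>lebesgue_r()&lt;/code> name and the 16-bit default are choices made for illustration only; rather than masking and then compressing, it picks out one bit at a time, which gives the same result.&lt;/p>
&lt;pre>&lt;code class="language-r" data-lang="r"># De-interleave the bits of a single integer z into x (even bits) and y (odd bits)
lebesgue_r &amp;lt;- function(z, n_bits = 16L) {
  x &amp;lt;- 0L
  y &amp;lt;- 0L
  for (i in seq_len(n_bits %/% 2L) - 1L) {
    # Even bit 2i of z becomes bit i of x
    x &amp;lt;- bitwOr(x, bitwShiftL(bitwAnd(bitwShiftR(z, 2L * i), 1L), i))
    # Odd bit 2i + 1 of z becomes bit i of y
    y &amp;lt;- bitwOr(y, bitwShiftL(bitwAnd(bitwShiftR(z, 2L * i + 1L), 1L), i))
  }
  c(x = x, y = y)
}

lebesgue_r(6L)  # binary 110
lebesgue_r(9L)  # binary 1001
&lt;/code>&lt;/pre>
&lt;p>For \(z = 6\) the even bits are 1 and 0 (reading from bit 2 down to bit 0), so \(x = 2\), and the single odd bit gives \(y = 1\). For \(z = 9\) the result is \(x = 1\) and \(y = 2\).&lt;/p>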
&lt;p>In the C++ code below I&amp;rsquo;ve defined the &lt;code>lebesgue_coords()&lt;/code> function that implements the above algorithm. It&amp;rsquo;s certainly not the most optimal implementation (it iterates through all the bits even if they&amp;rsquo;re 0), but it should have clarity. I&amp;rsquo;ve then vectorised it in the &lt;code>lebesgue()&lt;/code> function that returns a list of \(x\) and \(y\) coordinates for each \(z\) integer, and exported using Rcpp so it can be used in the R environment:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cpp" data-lang="cpp">&lt;span style="color:#75715e">#include&lt;/span> &lt;span style="color:#75715e">&amp;lt;Rcpp.h&amp;gt;&lt;/span>&lt;span style="color:#75715e">
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">using&lt;/span> &lt;span style="color:#66d9ef">namespace&lt;/span> Rcpp;
&lt;span style="color:#75715e">//The x,y vertex generated from the single z value
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">struct&lt;/span> &lt;span style="color:#a6e22e">vert&lt;/span> { &lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">long&lt;/span> x; &lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">long&lt;/span> y; };
&lt;span style="color:#75715e">//Lebesgue calculation for a single z value
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">struct&lt;/span> &lt;span style="color:#a6e22e">vert&lt;/span> &lt;span style="color:#a6e22e">lebesgue_coords&lt;/span>(&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">long&lt;/span> z) {
&lt;span style="color:#66d9ef">struct&lt;/span> &lt;span style="color:#a6e22e">vert&lt;/span> coords;
&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">long&lt;/span> shift_mask;
&lt;span style="color:#75715e">//Mask out even bits
&lt;/span>&lt;span style="color:#75715e">&lt;/span> coords.x &lt;span style="color:#f92672">=&lt;/span> z &lt;span style="color:#f92672">&amp;amp;&lt;/span> &lt;span style="color:#ae81ff">0x55555555&lt;/span>;
&lt;span style="color:#75715e">//Mask out odd bits, then shift back
&lt;/span>&lt;span style="color:#75715e">&lt;/span> coords.y &lt;span style="color:#f92672">=&lt;/span> (z &lt;span style="color:#f92672">&amp;amp;&lt;/span> &lt;span style="color:#ae81ff">0xaaaaaaaa&lt;/span>) &lt;span style="color:#f92672">&amp;gt;&amp;gt;&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>;
&lt;span style="color:#75715e">//This bit compresses the masked out bits.
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//i.e. 1010101 -&amp;gt; 1111
&lt;/span>&lt;span style="color:#75715e">&lt;/span> shift_mask &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0xfffffffc&lt;/span>;
&lt;span style="color:#66d9ef">do&lt;/span> {
&lt;span style="color:#75715e">//Extract the top bits, then shift them down one
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">long&lt;/span> &lt;span style="color:#66d9ef">int&lt;/span> x_upper &lt;span style="color:#f92672">=&lt;/span> (coords.x &lt;span style="color:#f92672">&amp;amp;&lt;/span> shift_mask) &lt;span style="color:#f92672">&amp;gt;&amp;gt;&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>;
&lt;span style="color:#66d9ef">long&lt;/span> &lt;span style="color:#66d9ef">int&lt;/span> y_upper &lt;span style="color:#f92672">=&lt;/span> (coords.y &lt;span style="color:#f92672">&amp;amp;&lt;/span> shift_mask) &lt;span style="color:#f92672">&amp;gt;&amp;gt;&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>;
&lt;span style="color:#75715e">//Clear out the top bits from x and re-introduce
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//the shift top bits, thereby compressing them together
&lt;/span>&lt;span style="color:#75715e">&lt;/span> coords.x &lt;span style="color:#f92672">=&lt;/span> x_upper &lt;span style="color:#f92672">|&lt;/span> (coords.x &lt;span style="color:#f92672">&amp;amp;&lt;/span> &lt;span style="color:#f92672">~&lt;/span>shift_mask) ;
coords.y &lt;span style="color:#f92672">=&lt;/span> y_upper &lt;span style="color:#f92672">|&lt;/span> (coords.y &lt;span style="color:#f92672">&amp;amp;&lt;/span> &lt;span style="color:#f92672">~&lt;/span>shift_mask);
} &lt;span style="color:#66d9ef">while&lt;/span> (shift_mask &lt;span style="color:#f92672">&amp;lt;&amp;lt;=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>);
&lt;span style="color:#66d9ef">return&lt;/span> coords;
}
&lt;span style="color:#75715e">// [[Rcpp::export]]
&lt;/span>&lt;span style="color:#75715e">&lt;/span>List &lt;span style="color:#a6e22e">lebesgue&lt;/span>(IntegerVector z) {
&lt;span style="color:#66d9ef">int&lt;/span> i;
&lt;span style="color:#66d9ef">struct&lt;/span> &lt;span style="color:#a6e22e">vert&lt;/span> v;
IntegerVector x,y;
&lt;span style="color:#66d9ef">for&lt;/span> (i &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>; i &lt;span style="color:#f92672">&amp;lt;&lt;/span> z.size(); i&lt;span style="color:#f92672">++&lt;/span>) {
v &lt;span style="color:#f92672">=&lt;/span> lebesgue_coords(z[i]);
x.push_back(v.x);
y.push_back(v.y);
}
&lt;span style="color:#66d9ef">return&lt;/span> List&lt;span style="color:#f92672">::&lt;/span>create(Named(&lt;span style="color:#e6db74">&amp;#34;x&amp;#34;&lt;/span>) &lt;span style="color:#f92672">=&lt;/span> x, Named(&lt;span style="color:#e6db74">&amp;#34;y&amp;#34;&lt;/span>) &lt;span style="color:#f92672">=&lt;/span> y);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With this in place we can look at how the function behaves across the integers \([0,255]\). You should be able to see the fractal-like behaviour, with clusters of 4, 16, 64, etc:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">lebesgue_points &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">tibble&lt;/span>(z &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#f92672">:&lt;/span> &lt;span style="color:#ae81ff">255&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(l &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">as_tibble&lt;/span>(&lt;span style="color:#a6e22e">lebesgue&lt;/span>(z))) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">unnest&lt;/span>(l)
&lt;span style="color:#a6e22e">print&lt;/span>(lebesgue_points)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 256 × 3
z x y
&amp;lt;int&amp;gt; &amp;lt;int&amp;gt; &amp;lt;int&amp;gt;
1 0 0 0
2 1 1 0
3 2 0 1
4 3 1 1
5 4 2 0
6 5 3 0
7 6 2 1
8 7 3 1
9 8 0 2
10 9 1 2
# … with 246 more rows
&lt;/code>&lt;/pre>&lt;p>&lt;img src="index_files/figure-html/unnamed-chunk-4-1.gif" alt="">&lt;!-- -->&lt;/p>
&lt;h1 id="locality">Locality&lt;/h1>
&lt;p>I mentioned at the start that the Lebesgue curve has &amp;lsquo;good locality&amp;rsquo;, but what exactly does this mean? There are multiple ways to define it, with a more rigorous take in &lt;a href="https://link.springer.com/chapter/10.1007/978-3-540-24587-2_40">this paper&lt;/a>. I&amp;rsquo;ll be a little more hand-wavy and define it as &amp;ldquo;points that are close together in one dimension should be close together in two dimensions.&amp;rdquo;&lt;/p>
&lt;p>We&amp;rsquo;ll look at consecutive numbers - which have a distance of 1 in one dimension - and compare their distance in two dimensions. More formally, we&amp;rsquo;ll see how far \((x_{z},y_{z})\) is from \((x_{z-1}, y_{z-1})\), using good old fashioned Pythagoras to determine the distance:&lt;/p>
&lt;p>$$ d = \sqrt{(x_{z} - x_{z-1})^2 + (y_{z} - y_{z-1})^2} $$
Let&amp;rsquo;s take a look at the average distance between consecutive \(z\) values in \([0,255]\):&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">lebesgue_locality &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
lebesgue_points &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
coord_distance &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(
(x &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#a6e22e">lag&lt;/span>(x,&lt;span style="color:#ae81ff">1&lt;/span>))^2 &lt;span style="color:#f92672">+&lt;/span>
(y &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#a6e22e">lag&lt;/span>(y,&lt;span style="color:#ae81ff">1&lt;/span>))^2
)
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">filter&lt;/span>(&lt;span style="color:#f92672">!&lt;/span>&lt;span style="color:#a6e22e">is.na&lt;/span>(coord_distance))
lebesgue_locality &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(mean_distance &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(coord_distance))
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 1 × 1
mean_distance
&amp;lt;dbl&amp;gt;
1 1.56
&lt;/code>&lt;/pre>&lt;p>So on average, each point is 1.56 times further away in the two-dimensional representation than in the one-dimensional representation. But as we all (should) know, an average is a summary and hides specifics. Taking a look at \(z\) versus the distance paints a more accurate picture of the underlying process:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/lebesgue-curves/index_files/figure-html/unnamed-chunk-6-1.png" width="672" />
Locality is good, except every so often we get a spike of distance between points. This spike is where we&amp;rsquo;re moving between our different power-of-two regions: \(2^4, 2^6, 2^8, \ldots\). For a different perspective, we map our two-dimensional points with colour and size conveying the distance:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/lebesgue-curves/index_files/figure-html/unnamed-chunk-7-1.png" width="672" />&lt;/p>
&lt;p>We see the large outlier, but in general most points are reasonably close to each other. More importantly it should be good enough for my original purposes.&lt;/p>
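&lt;p>The jumps can also be tabulated directly. The sketch below re-derives the coordinates in base R so that it stands alone (the &lt;code>decode_bits()&lt;/code> helper is hypothetical and only covers the 8-bit range used here), then counts how often each step length occurs between consecutive \(z\) values.&lt;/p>
&lt;pre>&lt;code class="language-r" data-lang="r"># Collect the given bit positions of z into a compact integer
decode_bits &amp;lt;- function(z, positions) {
  out &amp;lt;- integer(length(z))
  for (i in seq_along(positions)) {
    out &amp;lt;- out + bitwAnd(bitwShiftR(z, positions[i]), 1L) * 2^(i - 1)
  }
  out
}

z &amp;lt;- 0:255
x &amp;lt;- decode_bits(z, c(0L, 2L, 4L, 6L))  # even bits
y &amp;lt;- decode_bits(z, c(1L, 3L, 5L, 7L))  # odd bits

# Euclidean distance between each point and its predecessor
step &amp;lt;- sqrt(diff(x)^2 + diff(y)^2)

table(round(step, 2))   # how often each step length occurs
mean(step)              # average step length
z[which.max(step) + 1]  # the z value reached by the longest jump
&lt;/code>&lt;/pre>
&lt;p>About half of the steps have length 1, and the single longest jump lands on \(z = 128\), where the curve crosses from the lower half of the square to the upper half.&lt;/p>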
&lt;h1 id="additive-properties">Additive Properties&lt;/h1>
&lt;p>Finally, we&amp;rsquo;ll look at some additive properties that yielded an interesting result. As part of the aforementioned article I am using the Lebesgue curve in, a really useful property would have been this:&lt;/p>
&lt;p>$$ l(a + b) = l(a) + l(b) $$
Unfortunately I quickly determined that this was not the case:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">lebesgue&lt;/span>(&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">10&lt;/span> &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span>)&lt;span style="color:#f92672">$&lt;/span>x &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#a6e22e">lebesgue&lt;/span>(&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">10&lt;/span>)&lt;span style="color:#f92672">$&lt;/span>x &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">lebesgue&lt;/span>(&lt;span style="color:#ae81ff">3&lt;/span>)&lt;span style="color:#f92672">$&lt;/span>x
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code> [1] TRUE TRUE FALSE TRUE TRUE FALSE FALSE FALSE TRUE TRUE FALSE
&lt;/code>&lt;/pre>&lt;p>As I played around with different ranges and different addends, I couldn&amp;rsquo;t discern the pattern. The next move was to visualise it to try and better understand the interaction.&lt;/p>
&lt;p>We do this in the code below. The &lt;code>crossing()&lt;/code> function is a handy one to know: it generates all 128x128 combinations of the integers 0 through 127. For each of these pairs, we determine whether applying the function to the sum gives the same result as summing the function applied to each value individually. The result of this boolean is then visualised as a point on the graph:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">equality &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">crossing&lt;/span>(a &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">127&lt;/span>, b &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">127&lt;/span>) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> (&lt;span style="color:#a6e22e">lebesgue&lt;/span>(a &lt;span style="color:#f92672">+&lt;/span> b))&lt;span style="color:#f92672">$&lt;/span>x &lt;span style="color:#f92672">==&lt;/span> (&lt;span style="color:#a6e22e">lebesgue&lt;/span>(a)&lt;span style="color:#f92672">$&lt;/span>x &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">lebesgue&lt;/span>(b)&lt;span style="color:#f92672">$&lt;/span>x),
y &lt;span style="color:#f92672">=&lt;/span> (&lt;span style="color:#a6e22e">lebesgue&lt;/span>(a &lt;span style="color:#f92672">+&lt;/span> b))&lt;span style="color:#f92672">$&lt;/span>y &lt;span style="color:#f92672">==&lt;/span> (&lt;span style="color:#a6e22e">lebesgue&lt;/span>(a)&lt;span style="color:#f92672">$&lt;/span>y &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">lebesgue&lt;/span>(b)&lt;span style="color:#f92672">$&lt;/span>y)
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">pivot_longer&lt;/span>(cols &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(x, y), names_to &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;coord&amp;#39;&lt;/span>, values_to &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;equality&amp;#39;&lt;/span>)
&lt;span style="color:#a6e22e">print&lt;/span>(equality)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 32,768 × 4
a b coord equality
&amp;lt;int&amp;gt; &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;lgl&amp;gt;
1 0 0 x TRUE
2 0 0 y TRUE
3 0 1 x TRUE
4 0 1 y TRUE
5 0 2 x TRUE
6 0 2 y TRUE
7 0 3 x TRUE
8 0 3 y TRUE
9 0 4 x TRUE
10 0 4 y TRUE
# … with 32,758 more rows
&lt;/code>&lt;/pre>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/lebesgue-curves/index_files/figure-html/unnamed-chunk-10-1.png" width="672" />
I must admit, I was a bit stunned when I saw this pop out; it was not at all what I expected. You can see some interesting fractal behaviour, with each boolean pattern being repeated in larger and larger sections. It looks a little like a &lt;a href="https://en.wikipedia.org/wiki/Sierpi%C5%84ski_triangle">Sierpiński triangle&lt;/a>, but I&amp;rsquo;m not sure if there&amp;rsquo;s any relation. It may be somewhat anticlimactic, but I haven&amp;rsquo;t delved any deeper into this. That gets added to the ever-growing todo list.&lt;/p>
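&lt;p>One possible lead, offered as a sketch rather than an answer: if &lt;code>lebesgue()&lt;/code> de-interleaves the bits of its argument, addition can only be preserved when no carry disturbs those bits. The code below uses a hypothetical stand-in, &lt;code>deinterleave_x()&lt;/code>, which assumes the x coordinate is built from the even-numbered bits (this reproduces the TRUE/FALSE output shown earlier, but it is not this article&amp;rsquo;s own implementation). It checks that every pair with no bits in common, &lt;code>bitwAnd(a, b) == 0&lt;/code>, satisfies the additive property. Plotting that same no-shared-bits condition over a grid draws a Sierpiński triangle, which may explain the resemblance.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Hypothetical stand-in for lebesgue()$x: pack bits 0, 2, 4, ... of n together.
# Assumes non-negative whole numbers below 4^bits.
deinterleave_x &amp;lt;- function(n, bits = 8) {
  x &amp;lt;- numeric(length(n))
  for (i in seq_len(bits) - 1) {
    x &amp;lt;- x + ((n %/% 4^i) %% 2) * 2^i
  }
  x
}

# Reproduces the first check from this section
deinterleave_x(0:10 + 3) == deinterleave_x(0:10) + deinterleave_x(3)

# Every pair with no bits in common satisfies l(a + b) = l(a) + l(b)
pairs &amp;lt;- expand.grid(a = 0:127, b = 0:127)
additive &amp;lt;- with(pairs, deinterleave_x(a + b) == deinterleave_x(a) + deinterleave_x(b))
no_shared_bits &amp;lt;- with(pairs, bitwAnd(a, b) == 0)
all(additive[no_shared_bits])
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The condition is sufficient but not necessary: a = 1 and b = 3 share a bit yet still pass the x check, so the full pattern is a little richer than the plain triangle.&lt;/p>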
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>In this article we had a brief, exploratory look at the space-filling &amp;ldquo;Lebesgue&amp;rdquo; curve. We looked at how it&amp;rsquo;s implemented, some of its locality behaviour, and some interesting results under addition. In a future article we&amp;rsquo;ll use this algorithm to help visualise the dynamic memory allocations of a process.&lt;/p></description></item><item><title>Free WiFi with Randomness</title><link>https://clt.blog.foletta.net/post/random-wifi-password/</link><pubDate>Sun, 02 Oct 2022 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/random-wifi-password/</guid><description>&lt;p>There&amp;rsquo;s a few different pictures making their way around social media showing a complicated definite integral, and asking guests to evaluate it to get the password for free WiFi. Here&amp;rsquo;s an example:&lt;/p>
&lt;p>&lt;img src="wifi_integral.jpg" style="display: block; margin: auto;" />&lt;/p>
&lt;p>I loved calculus back at uni, but uni was a fair while ago now and I&amp;rsquo;m more than a little dusty. I briefly attempted integrating the equation, looking up some terms I could remember (integration by parts? product rule?). But then I thought: if I&amp;rsquo;m sitting in a cafe trying to get WiFi, I&amp;rsquo;m going to want a quick and dirty numerical solution, not some beautiful mathematical &amp;lsquo;proof&amp;rsquo;.&lt;/p>
&lt;p>So in this article I&amp;rsquo;m going to show you that quick and dirty solution. We&amp;rsquo;re going to use randomness via the Monte Carlo method to find a numerical solution to the definite integral, hopefully getting us access to that sweet free WiFi.&lt;/p>
&lt;h1 id="the-function">The Function&lt;/h1>
&lt;p>Let&amp;rsquo;s first define the function that is to be integrated in R, calling it \(f()\):&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">f &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(x) {
(x^3 &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#a6e22e">cos&lt;/span>(x&lt;span style="color:#f92672">/&lt;/span>&lt;span style="color:#ae81ff">2&lt;/span>) &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">/&lt;/span>&lt;span style="color:#ae81ff">2&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#a6e22e">sqrt&lt;/span>(&lt;span style="color:#ae81ff">4&lt;/span> &lt;span style="color:#f92672">-&lt;/span> x^2)
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>We&amp;rsquo;ll then calculate \(f(x)\) for values of x in [-2, 2], using a small increment between each x value to ensure we&amp;rsquo;ve got reasonable accuracy for our subsequent calculations.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">coords &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">tibble&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">seq&lt;/span>(from &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">-2&lt;/span>, to &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>, by &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0.000001&lt;/span>)
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">f&lt;/span>(x))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Let&amp;rsquo;s take a look at what the function looks like:
&lt;img src="https://clt.blog.foletta.net/post/random-wifi-password/index_files/figure-html/unnamed-chunk-5-1.png" width="672" />&lt;/p>
&lt;p>There&amp;rsquo;s two areas we&amp;rsquo;ll need to take into consideration when integrating: from -2 to around -0.8, and from around -0.8 to 2. We need to remember that when integrating, regions above the x-axis evaluate to positive numbers, but regions below evaluate to negative numbers. When calculating the total, we&amp;rsquo;ll have to subtract the area of the left region from the area of the right region.&lt;/p>
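&lt;p>The point where the curve crosses the x-axis can be located numerically rather than read off the plot. This is a minimal sketch using base R&amp;rsquo;s &lt;code>uniroot()&lt;/code>; the search interval of -1.9 to 0 is my own choice, picked because \(f\) is negative at one end and positive at the other.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Same function as defined above, repeated so the snippet runs on its own
f &amp;lt;- function(x) {
  (x^3 * cos(x / 2) + 1 / 2) * sqrt(4 - x^2)
}

# f(-1.9) is negative and f(0) is positive, so the sign change lies in between
root &amp;lt;- uniroot(f, interval = c(-1.9, 0), tol = 1e-9)$root
root
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Everything to the left of &lt;code>root&lt;/code> contributes negative area, and everything to the right contributes positive area.&lt;/p>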
&lt;p>We&amp;rsquo;re going to need the minimum and maximum values of \(f(x)\) in our calculation, so let&amp;rsquo;s pull them out:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">min_max &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
coords &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(
min_y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">min&lt;/span>(y),
max_y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">max&lt;/span>(y)
)
min_max
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 1 × 2
min_y max_y
&amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
1 -2.89 4.03
&lt;/code>&lt;/pre>&lt;p>We can now move on to integrating this function.&lt;/p>
&lt;h1 id="integration-with-randomness">Integration with Randomness&lt;/h1>
&lt;p>How are we going to calculate a numerical answer to this integral? We&amp;rsquo;ll use randomness to help us:&lt;/p>
&lt;ul>
&lt;li>Draw a number of random x and y values from a uniform distribution.&lt;/li>
&lt;li>Calculate f(x) for each of the random x values.&lt;/li>
&lt;li>Determine whether the random y lies between the x-axis and f(x), either above the axis or below it depending on the region.&lt;/li>
&lt;li>Find the ratio of points inside the areas to the total number of points.&lt;/li>
&lt;li>Multiply this ratio by the area of the bounding rectangle to find the value of the definite integral.&lt;/li>
&lt;/ul>
&lt;p>As a first step, let&amp;rsquo;s find the total area:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">total_area &lt;span style="color:#f92672">&amp;lt;-&lt;/span> (&lt;span style="color:#ae81ff">2&lt;/span> &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#ae81ff">-2&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> (min_max[[&lt;span style="color:#e6db74">&amp;#39;max_y&amp;#39;&lt;/span>]] &lt;span style="color:#f92672">-&lt;/span> min_max[[&lt;span style="color:#e6db74">&amp;#39;min_y&amp;#39;&lt;/span>]])
total_area
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>[1] 27.66986
&lt;/code>&lt;/pre>&lt;p>As mentioned before, we&amp;rsquo;ll have to be wary of the area under the x-axis. We&amp;rsquo;ll use an encoding scheme where points outside are encoded as 0, points inside the positive area are encoded as 1, and points inside the negative area are encoded as -1. We can then simply take the &lt;code>mean()&lt;/code> of these encoded values to determine the ratio of the area.&lt;/p>
&lt;p>Here&amp;rsquo;s a first pass with 50,000 points to show how it works. I&amp;rsquo;ve omitted the graph rendering code:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">ratio &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">tibble&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">runif&lt;/span>(&lt;span style="color:#ae81ff">50000&lt;/span>, &lt;span style="color:#ae81ff">-2&lt;/span>, &lt;span style="color:#ae81ff">2&lt;/span>),
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">runif&lt;/span>(&lt;span style="color:#ae81ff">50000&lt;/span>, min_max[[&lt;span style="color:#e6db74">&amp;#39;min_y&amp;#39;&lt;/span>]], min_max[[&lt;span style="color:#e6db74">&amp;#39;max_y&amp;#39;&lt;/span>]]),
fx &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">f&lt;/span>(x)
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
integral_encoding &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">case_when&lt;/span>(
&lt;span style="color:#75715e"># Above the x-axis, below the curve&lt;/span>
fx &lt;span style="color:#f92672">&amp;gt;&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#f92672">&amp;amp;&lt;/span> y &lt;span style="color:#f92672">&amp;lt;&lt;/span> fx &lt;span style="color:#f92672">&amp;amp;&lt;/span> y &lt;span style="color:#f92672">&amp;gt;&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>,
&lt;span style="color:#75715e"># Below the x-axis, above the curve&lt;/span>
fx &lt;span style="color:#f92672">&amp;lt;&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#f92672">&amp;amp;&lt;/span> y &lt;span style="color:#f92672">&amp;gt;&lt;/span> fx &lt;span style="color:#f92672">&amp;amp;&lt;/span> y &lt;span style="color:#f92672">&amp;lt;&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#ae81ff">-1&lt;/span>,
&lt;span style="color:#75715e"># Everything else&lt;/span>
&lt;span style="color:#66d9ef">TRUE&lt;/span> &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>
)
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/random-wifi-password/index_files/figure-html/unnamed-chunk-9-1.png" width="672" />&lt;/p>
&lt;p>We see the random points, and their associated encoding which tells us which area they&amp;rsquo;re in. 50,000 points probably isn&amp;rsquo;t enough for an accurate answer, so we&amp;rsquo;ll do another run with 100,000,000 points. I&amp;rsquo;ve omitted the code in this run, but the results are in &lt;code>ratio&lt;/code>. We then summarise the mean of the encoding, giving us our ratio.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">integral_ratio &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
ratio &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(
ratio &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(integral_encoding)
) &lt;span style="color:#f92672">|&amp;gt;&lt;/span>
&lt;span style="color:#a6e22e">pull&lt;/span>(ratio)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The calculated ratio is (&lt;em>drumroll&lt;/em>):&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">integral_ratio
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>[1] 0.1134934
&lt;/code>&lt;/pre>&lt;p>Just above 11%. Multiplying this by the total area gives us the answer we need:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">total_area &lt;span style="color:#f92672">*&lt;/span> integral_ratio
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>[1] 3.140347
&lt;/code>&lt;/pre>&lt;p>Immediately we notice something interesting about that number: it&amp;rsquo;s very close to Pi. Remember we&amp;rsquo;re here for a quick and dirty way to get the WiFi password, so my first guess would be the first 10 digits of Pi.&lt;/p>
&lt;p>Why didn&amp;rsquo;t we get the exact value? Setting aside any mistakes I may have made in my calculations above, I&amp;rsquo;d guess that the number of random points isn&amp;rsquo;t enough to get good enough precision. The reason I&amp;rsquo;ve not gone larger is that I run out of memory trying to generate anything more than 100,000,000 points.&lt;/p>
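&lt;p>Both the precision and the memory questions can be explored with a small variation, sketched here under a few assumptions of my own. The points are generated in batches so only one batch is held in memory at a time, and the spread of the encoded values gives a standard error for the estimate. The helper &lt;code>encode_batch()&lt;/code>, the batch sizes, and the slightly widened bounding box (-2.9 to 4.1, based on the rounded minimum and maximum above) are illustrative choices, not what was used for the results in this post.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">f &amp;lt;- function(x) {
  (x^3 * cos(x / 2) + 1 / 2) * sqrt(4 - x^2)
}

# Assumed bounding box, slightly wider than the observed min and max of f
y_min &amp;lt;- -2.9
y_max &amp;lt;- 4.1
box_area &amp;lt;- (2 - -2) * (y_max - y_min)

# Hypothetical helper: encode one batch of points as 1, -1 or 0 and
# return only the running totals, so the points themselves can be discarded
encode_batch &amp;lt;- function(n) {
  x &amp;lt;- runif(n, -2, 2)
  y &amp;lt;- runif(n, y_min, y_max)
  fx &amp;lt;- f(x)
  enc &amp;lt;- (fx &amp;gt; 0 &amp;amp; y &amp;gt; 0 &amp;amp; y &amp;lt; fx) - (fx &amp;lt; 0 &amp;amp; y &amp;lt; 0 &amp;amp; y &amp;gt; fx)
  c(sum = sum(enc), sum_sq = sum(enc^2))
}

set.seed(1)
batches &amp;lt;- 20
batch_size &amp;lt;- 1e5
totals &amp;lt;- rowSums(replicate(batches, encode_batch(batch_size)))
n &amp;lt;- batches * batch_size

# Mean and variance of the encoding, then scale by the box area
ratio &amp;lt;- totals[[&amp;#34;sum&amp;#34;]] / n
variance &amp;lt;- totals[[&amp;#34;sum_sq&amp;#34;]] / n - ratio^2
estimate &amp;lt;- box_area * ratio
std_error &amp;lt;- box_area * sqrt(variance / n)

c(estimate = estimate, std_error = std_error)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The standard error shrinks with the square root of the number of points, so each extra decimal digit of accuracy costs roughly 100 times as many points. That is why even a very large run only pins down the first few digits.&lt;/p>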
&lt;p>I&amp;rsquo;ve been a little deceptive, as there&amp;rsquo;s actually a quick and easy way to approximate the definite integral in base R. It doesn&amp;rsquo;t make for a very exciting article on its own:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">integrate&lt;/span>(f, &lt;span style="color:#ae81ff">-2&lt;/span>, &lt;span style="color:#ae81ff">2&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>3.141593 with absolute error &amp;lt; 2e-09
&lt;/code>&lt;/pre>&lt;p>Clearly this implementation is far superior, being faster and using less memory. A brief investigation leads me to believe it&amp;rsquo;s using &lt;a href="https://en.wikipedia.org/wiki/Adaptive_quadrature">adaptive quadrature&lt;/a> under the hood.&lt;/p>
&lt;h1 id="whats-the-point">What&amp;rsquo;s the Point?&lt;/h1>
&lt;p>You may rightfully say &amp;ldquo;your method is slow, uses lots of resources, and isn&amp;rsquo;t even that accurate in the end: why use it?&amp;rdquo;. And you&amp;rsquo;d be correct: we could use the in-built integration function, or even use a 2-dimensional grid rather than random points. It&amp;rsquo;s probably not the best method.&lt;/p>
&lt;p>But for more complicated integrals in higher dimensions, things become increasingly difficult. The &lt;code>integrate()&lt;/code> function only handles one-dimensional integrals, and the compute resources required by the grid method grow exponentially with the number of dimensions.&lt;/p>
&lt;p>By using randomness, a reasonable approximation of a definite integral can still be achieved using fewer compute resources, despite the complexity of the problem.&lt;/p>
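&lt;p>As an illustration of the higher-dimensional case, here is a sketch that estimates the volume of a five-dimensional unit ball with the same inside/outside idea. The example problem, the seed, and the point count are my own choices; the exact volume formula is only there to check the estimate. A grid with 100 steps per axis would need 100^5 (ten billion) evaluations for the same problem.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">set.seed(42)
dims &amp;lt;- 5
n &amp;lt;- 200000

# Random points in the hypercube [-1, 1]^5, one row per point
points &amp;lt;- matrix(runif(n * dims, -1, 1), ncol = dims)

# A point is inside the unit ball if its squared distance from the origin is at most 1
inside &amp;lt;- rowSums(points^2) &amp;lt;= 1

# Ratio of points inside, multiplied by the volume of the enclosing hypercube
cube_volume &amp;lt;- 2^dims
estimate &amp;lt;- cube_volume * mean(inside)

# Known closed form for the volume of a unit ball, used only as a check
exact &amp;lt;- pi^(dims / 2) / gamma(dims / 2 + 1)

c(estimate = estimate, exact = exact)
&lt;/code>&lt;/pre>&lt;/div>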
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>In this post we looked at a numerical approach to calculating a definite integral. We generated random points and encoded them as to whether they were inside or outside of the integral area. We then used the ratio of this encoding and multiplied it by the total area of the region in question to get the answer.&lt;/p>
&lt;p>If you&amp;rsquo;re out and about in need of free WiFi and you see this question (or a question like it) posed to get access, don&amp;rsquo;t be intimidated: go for the quick and dirty numerical approach. Either that, or get the first 10 digits of Pi from your phone and give that a crack.&lt;/p></description></item><item><title>Git Under the Hood</title><link>https://clt.blog.foletta.net/post/git-under-the-hood/</link><pubDate>Sun, 29 May 2022 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/git-under-the-hood/</guid><description>&lt;p>While I&amp;rsquo;m not a programmer per se, I do use git almost daily and find it a great tool for source control and versioning of plain text files. But I don&amp;rsquo;t think there can be any doubt that it is &lt;a href="https://xkcd.com/1597/">not the easiest tool to use&lt;/a>. Despite its unintuitive user interface, under the hood git is quite simple and elegant. I believe that if you can understand the fundamental constructs git uses to store, track, and manage files, then using git becomes a lot easier.&lt;/p>
&lt;p>In this article we&amp;rsquo;re going to take a look under the covers and investigate git&amp;rsquo;s fundamental constructs. We&amp;rsquo;ll start off with its storage model and look at blobs, trees and commits. We&amp;rsquo;ll see how branches are implemented, and finally we&amp;rsquo;ll unpack the git index file to understand what happens during the staging of a commit.&lt;/p>
&lt;p>There&amp;rsquo;s already &lt;a href="https://jwiegley.github.io/git-from-the-bottom-up/">loads&lt;/a> of &lt;a href="https://medium.com/hackernoon/understanding-git-fcffd87c15a3">articles&lt;/a> out there on git internals, so what&amp;rsquo;s different about this one? Two things:&lt;/p>
&lt;ol>
&lt;li>We&amp;rsquo;re going to avoid the lower level &amp;lsquo;plumbing&amp;rsquo; git commands and limit ourselves to the five most common &amp;lsquo;porcelain&amp;rsquo; commands: &lt;code>git {init, add, commit, branch, checkout}&lt;/code>. All other work will be done using standard command line utilities.&lt;/li>
&lt;li>Using the R packages &lt;a href="https://github.com/ropensci/git2r">git2r&lt;/a> and &lt;a href="https://github.com/thomasp85/tidygraph">tidygraph&lt;/a>, we&amp;rsquo;ll dynamically build up a picture of the connections between git&amp;rsquo;s objects to help understand how they are tied together.&lt;/li>
&lt;/ol>
&lt;p>As always, the source code for this article is available up on &lt;a href="https://github.com/gregfoletta/articles.foletta.org/blob/production/content/post/2022-05-30-git-under-the-hood/index.Rmarkdown">github&lt;/a>.&lt;/p>
&lt;h1 id="initialisation">Initialisation&lt;/h1>
&lt;p>We&amp;rsquo;ll start by initialising a git repository, which creates a &lt;em>.git&lt;/em> directory and some initial files. Git holds all of the files and metadata it needs for source control in this directory. To clarify things we&amp;rsquo;ll prune back as many of the initial files as possible, while still ensuring git recognises it as a valid repository.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Initialise the repository (quietly)&lt;/span>
git init --quiet
&lt;span style="color:#75715e"># Remove some of the created files and directories&lt;/span>
rm -rf .git/&lt;span style="color:#f92672">{&lt;/span>hooks,info,config,branches,description&lt;span style="color:#f92672">}&lt;/span>
rm -rf .git/objects/&lt;span style="color:#f92672">{&lt;/span>info,pack&lt;span style="color:#f92672">}&lt;/span>
rm -rf .git/refs/&lt;span style="color:#f92672">{&lt;/span>heads,tags&lt;span style="color:#f92672">}&lt;/span>
tree .git
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>.git
├── HEAD
├── objects
└── refs
2 directories, 1 file
&lt;/code>&lt;/pre>&lt;p>With a nice clean slate we can move on to our first object: the blob.&lt;/p>
&lt;h1 id="blobs">Blobs&lt;/h1>
&lt;p>The first fundamental git object we&amp;rsquo;ll look at is the &lt;em>blob&lt;/em>. If we create a file and add it to the staging area, we see what has changed in the .git directory.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Create a file&lt;/span>
echo &lt;span style="color:#e6db74">&amp;#34;Root&amp;#34;&lt;/span> &amp;gt; file_x
&lt;span style="color:#75715e"># Add it to the staging area&lt;/span>
git add file_x
tree .git
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>.git
├── HEAD
├── index
├── objects
│   └── 93
│   └── 39e13010d12194986b13e3a777ae5ec4f7c8a6
└── refs
3 directories, 3 files
&lt;/code>&lt;/pre>&lt;p>There&amp;rsquo;s two new files - an index and an object. We&amp;rsquo;ll get to the index later in the article, but for now let&amp;rsquo;s focus on the object. Checking its format we find that it&amp;rsquo;s compressed data:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">file .git/objects/93/39e13010d12194986b13e3a777ae5ec4f7c8a6
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>.git/objects/93/39e13010d12194986b13e3a777ae5ec4f7c8a6: zlib compressed data
&lt;/code>&lt;/pre>&lt;p>Decompressing and looking inside, we see the file is structured as a &amp;ldquo;type, length, value&amp;rdquo; or TLV. The type is &amp;lsquo;blob&amp;rsquo;, the length is the size of the data in bytes (excluding the type and length header), and the value is the contents of the file we created:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">pigz -cd .git/objects/93/39e13010d12194986b13e3a777ae5ec4f7c8a6 | hexdump -C
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>00000000 62 6c 6f 62 20 35 00 52 6f 6f 74 0a |blob 5.Root.|
0000000c
&lt;/code>&lt;/pre>&lt;p>How is the path to the blob object created? It&amp;rsquo;s simply the SHA1 hash of the blob itself. The first two hexadecimal digits are used as a folder, with the remaining digits used as the object file name. We can show this by recreating the object, taking the hash, and noting it&amp;rsquo;s the same as our object&amp;rsquo;s path:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># The hash matches the object&amp;#39;s path&lt;/span>
echo &lt;span style="color:#e6db74">&amp;#34;blob 5\0Root&amp;#34;&lt;/span> | shasum
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>9339e13010d12194986b13e3a777ae5ec4f7c8a6 -
&lt;/code>&lt;/pre>&lt;p>It can therefore be helpful to understand git as a form of content addressable storage: the location of the data that is under its control is based on the content of the data itself. Let&amp;rsquo;s explore this further: we&amp;rsquo;ll add two files with the same contents, one in the root directory and one in a subdirectory.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Create a subdir&lt;/span>
mkdir subdir
&lt;span style="color:#75715e"># Add two files, one in the root, one in the subdir&lt;/span>
echo &lt;span style="color:#e6db74">&amp;#34;Root &amp;amp; Sub&amp;#34;&lt;/span> &amp;gt; file_y
echo &lt;span style="color:#e6db74">&amp;#34;Root &amp;amp; Sub&amp;#34;&lt;/span> &amp;gt; subdir/file_z
&lt;span style="color:#75715e"># Add the files to the staging area&lt;/span>
git add file_y subdir
tree .git
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>.git
├── HEAD
├── index
├── objects
│   ├── 93
│   │   └── 39e13010d12194986b13e3a777ae5ec4f7c8a6
│   └── cc
│   └── 23f67bb60997d9628f4fd1e9e84f92fd49780e
└── refs
4 directories, 4 files
&lt;/code>&lt;/pre>&lt;p>We&amp;rsquo;ve added three files, but there are only two objects. Blobs only contain raw data, with no reference to files or directories. There&amp;rsquo;s only two pieces of unique data, and thus two objects. The file/directory information is stored in tree objects, which we will take a look at in the next section.&lt;/p>
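&lt;p>The blob layout and the path convention can be sketched in a few lines of base R. The helper names &lt;code>make_blob()&lt;/code> and &lt;code>object_path()&lt;/code> are hypothetical, made up for this illustration. The sketch only rebuilds the uncompressed bytes and splits a known hash into its directory and file parts; the SHA-1 and zlib steps are left to the command line tools used above.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Hypothetical helper: build the uncompressed bytes of a blob object,
# laid out as the type, a space, the content length, a null byte, then the content
make_blob &amp;lt;- function(content) {
  body &amp;lt;- charToRaw(content)
  header &amp;lt;- c(charToRaw(paste(&amp;#34;blob&amp;#34;, length(body))), as.raw(0))
  c(header, body)
}

# Hypothetical helper: split a 40 character hash into the path under .git/objects
object_path &amp;lt;- function(hash) {
  file.path(&amp;#34;.git&amp;#34;, &amp;#34;objects&amp;#34;, substr(hash, 1, 2), substr(hash, 3, 40))
}

# The same bytes that hexdump showed for file_x
blob_hex &amp;lt;- paste(as.character(make_blob(&amp;#34;Root\n&amp;#34;)), collapse = &amp;#34; &amp;#34;)
blob_hex

# The file name is not an input, so file_y and subdir/file_z yield identical bytes
shared_length &amp;lt;- length(charToRaw(&amp;#34;Root &amp;amp; Sub\n&amp;#34;))
shared_length

object_path(&amp;#34;9339e13010d12194986b13e3a777ae5ec4f7c8a6&amp;#34;)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Because nothing but the content goes into the bytes, two files with the same content can only ever produce one object.&lt;/p>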
&lt;p>Over the course of the article we&amp;rsquo;re going to build up a graph of the objects in the git repository to gain a visual representation of the structure. Here&amp;rsquo;s our starting point: two blobs, the first four characters of their hash, and their contents:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/git-under-the-hood/index_files/figure-html/unnamed-chunk-12-1.png" width="672" />&lt;/p>
&lt;p>The two little blobs of data look pretty lonely out there. What they need is some context, and this is where the tree object comes in.&lt;/p>
&lt;h1 id="tree">Tree&lt;/h1>
&lt;p>As we&amp;rsquo;ve seen, there&amp;rsquo;s no file or directory information in the blob object. The storage of this information is the role of the tree object. Let&amp;rsquo;s perform our first commit and see what&amp;rsquo;s changed in the repository:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">git commit --quiet -m &lt;span style="color:#e6db74">&amp;#34;First Commit&amp;#34;&lt;/span>
tree .git
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>.git
├── COMMIT_EDITMSG
├── HEAD
├── index
├── logs
│   ├── HEAD
│   └── refs
│   └── heads
│   └── master
├── objects
│   ├── 36
│   │   └── 58bfd8a7cda8ee50181497ab8ec4e699428877
│   ├── 4e
│   │   └── eafbc980bb5cc210392fa9712eeca32ded0f7d
│   ├── 67
│   │   └── 21ae08f27ae139ec833f8ab14e3361c38d07bd
│   ├── 93
│   │   └── 39e13010d12194986b13e3a777ae5ec4f7c8a6
│   └── cc
│   └── 23f67bb60997d9628f4fd1e9e84f92fd49780e
└── refs
└── heads
└── master
11 directories, 11 files
&lt;/code>&lt;/pre>&lt;p>A lot has changed and at first glance it may be a bit overwhelming, but focus in on the &lt;em>objects&lt;/em> subdirectory. There&amp;rsquo;s an additional three objects on top of the two blob objects we saw before. What are they? If you&amp;rsquo;ll forgive me, I&amp;rsquo;ll use a very ugly set of commands to pull out the type and length from each of the objects:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># May god have mercy on my soul&lt;/span>
find .git/objects -type f -exec sh -c &lt;span style="color:#ae81ff">\
&lt;/span>&lt;span style="color:#ae81ff">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;echo -n &amp;#39;{} -&amp;gt; &amp;#39; &amp;amp;&amp;amp; pigz -cd {} | perl -0777 -nE &amp;#39;say unpack qw(Z*)&amp;#39;&amp;#34;&lt;/span> &lt;span style="color:#ae81ff">\;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>.git/objects/cc/23f67bb60997d9628f4fd1e9e84f92fd49780e -&amp;gt; blob 11
.git/objects/36/58bfd8a7cda8ee50181497ab8ec4e699428877 -&amp;gt; commit 175
.git/objects/4e/eafbc980bb5cc210392fa9712eeca32ded0f7d -&amp;gt; tree 101
.git/objects/67/21ae08f27ae139ec833f8ab14e3361c38d07bd -&amp;gt; tree 34
.git/objects/93/39e13010d12194986b13e3a777ae5ec4f7c8a6 -&amp;gt; blob 5
&lt;/code>&lt;/pre>&lt;p>So in addition to our two blobs, we&amp;rsquo;ve got two trees and a commit. Our starting point will be the first of the tree objects. Unlike the others, trees contain some binary information rather than UTF-8 strings. I&amp;rsquo;ll use Perl&amp;rsquo;s &lt;code>unpack()&lt;/code> function to decode this into hexadecimal:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">pigz -cd .git/objects/4e/eafbc980bb5cc210392fa9712eeca32ded0f7d |&lt;span style="color:#ae81ff">\
&lt;/span>&lt;span style="color:#ae81ff">&lt;/span>perl -0777 -nE &lt;span style="color:#e6db74">&amp;#39;print join &amp;#34;\n&amp;#34;, unpack(&amp;#34;Z*(Z*H40)*&amp;#34;)&amp;#39;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>tree 101
100644 file_x
9339e13010d12194986b13e3a777ae5ec4f7c8a6
100644 file_y
cc23f67bb60997d9628f4fd1e9e84f92fd49780e
40000 subdir
6721ae08f27ae139ec833f8ab14e3361c38d07bd
&lt;/code>&lt;/pre>&lt;p>Like the blob we have the type and length of the object. Following this there&amp;rsquo;s an entry for each of the two files and the subdirectory that reside in the root directory. The digits preceding the file name are the &lt;em>mode&lt;/em>, capturing the type of filesystem object (regular file/symbolic link/directory) and its permissions. Git doesn&amp;rsquo;t track all combinations of permissions, only whether the file is user executable or not.&lt;/p>
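&lt;p>The mode is an octal number, and its two parts can be pulled apart with bit masks. This is a minimal sketch with a hypothetical helper, &lt;code>decode_mode()&lt;/code>; the masks are the standard POSIX file type values, and only the three types mentioned above are handled.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Hypothetical helper: split a tree entry mode into its file type
# and whether the user-executable bit is set
decode_mode &amp;lt;- function(mode) {
  value &amp;lt;- strtoi(mode, base = 8L)

  # The high bits hold the type of filesystem object
  type_bits &amp;lt;- bitwAnd(value, strtoi(&amp;#34;170000&amp;#34;, base = 8L))
  types &amp;lt;- c(
    &amp;#34;100000&amp;#34; = &amp;#34;regular file&amp;#34;,
    &amp;#34;120000&amp;#34; = &amp;#34;symbolic link&amp;#34;,
    &amp;#34;40000&amp;#34; = &amp;#34;directory&amp;#34;
  )
  codes &amp;lt;- strtoi(names(types), base = 8L)

  list(
    type = unname(types[match(type_bits, codes)]),
    # Octal 100 is the user-executable permission bit
    user_executable = bitwAnd(value, strtoi(&amp;#34;100&amp;#34;, base = 8L)) != 0
  )
}

decode_mode(&amp;#34;100644&amp;#34;)
decode_mode(&amp;#34;40000&amp;#34;)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>An executable script would show up as 100755 instead, which the same helper reports as a regular file with the user-executable bit set.&lt;/p>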
&lt;p>Each of the two file entries points to the hash of a blob object, but the subdirectory points to the hash of another tree object. Looking inside that tree object:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">pigz -cd .git/objects/67/21ae08f27ae139ec833f8ab14e3361c38d07bd |&lt;span style="color:#ae81ff">\
&lt;/span>&lt;span style="color:#ae81ff">&lt;/span>perl -0777 -nE &lt;span style="color:#e6db74">&amp;#39;print join &amp;#34;\n&amp;#34;, unpack(&amp;#34;Z*(Z*H40)*&amp;#34;)&amp;#39;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>tree 34
100644 file_z
cc23f67bb60997d9628f4fd1e9e84f92fd49780e
&lt;/code>&lt;/pre>&lt;p>We see this has an entry for the &lt;em>file_z&lt;/em> in the subdirectory, and this points to the same hash as the &lt;em>file_y&lt;/em> entry in the previous tree. A graph should make this clearer:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/git-under-the-hood/index_files/figure-html/unnamed-chunk-17-1.png" width="672" />
At the top is the root of the tree, with edges to the two blobs (&lt;em>file_x&lt;/em> and &lt;em>file_y&lt;/em>) and a tree (the subdirectory). The second tree has a single edge to a blob (&lt;em>file_z&lt;/em>). The root and subdirectory trees both point to the same blob, because &lt;em>file_y&lt;/em> and &lt;em>file_z&lt;/em> have the same contents.&lt;/p>
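&lt;p>That shared blob follows from content addressing: an object is named by a hash of its contents, not by its path. A minimal sketch of this, assuming &lt;code>git&lt;/code> is on the PATH (the sample text is made up for illustration):&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh"># Hash the same bytes twice, as git would for two files with identical contents.
# Without -w, hash-object only computes the blob name; nothing is stored.
hash_y=$(printf 'same contents\n' | git hash-object --stdin)
hash_z=$(printf 'same contents\n' | git hash-object --stdin)
echo "file_y blob: $hash_y"
echo "file_z blob: $hash_z"
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Both lines print the same hash, which is why a single blob can serve two tree entries.&lt;/p>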
&lt;p>With blobs and trees we&amp;rsquo;ve built up a pseudo-filesystem, but how does this help us with source control? The object that ties all of this together is the commit.&lt;/p>
&lt;h1 id="commit">Commit&lt;/h1>
&lt;p>Our third object - and the one most visible to the end user - is the commit. We&amp;rsquo;re going to open our commit up, but as we&amp;rsquo;ll learn about shortly, the hash of the commit (and thus the path to the object) depends partially on the time the commit was made. As I generate this article dynamically, there&amp;rsquo;s a little bit of overhead to get the commit object:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Get the commit object&lt;/span>
COMMIT_DIR&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#66d9ef">$(&lt;/span>cut -c 1-2 .git/refs/heads/master&lt;span style="color:#66d9ef">)&lt;/span>
COMMIT_OBJ&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#66d9ef">$(&lt;/span>cut -c 3- .git/refs/heads/master&lt;span style="color:#66d9ef">)&lt;/span>
&lt;span style="color:#75715e"># Open it up&lt;/span>
pigz -cd .git/objects/$COMMIT_DIR/$COMMIT_OBJ |&lt;span style="color:#ae81ff">\
&lt;/span>&lt;span style="color:#ae81ff">&lt;/span>perl -0777 -nE &lt;span style="color:#e6db74">&amp;#39;print join &amp;#34;\n&amp;#34;, unpack(&amp;#34;Z*A*&amp;#34;)&amp;#39;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>commit 175
tree 4eeafbc980bb5cc210392fa9712eeca32ded0f7d
author Greg Foletta &amp;lt;greg@foletta.org&amp;gt; 1654027280 +1000
committer Greg Foletta &amp;lt;greg@foletta.org&amp;gt; 1654027280 +1000
First Commit
&lt;/code>&lt;/pre>&lt;p>Again we have the type and length of the object at the start, then there&amp;rsquo;s a few different pieces of information.&lt;/p>
&lt;p>The first is a reference to a tree object. This is the tree object that represents the root directory of the repository. The next is the author of the commit, with their name, email address, commit time and the UTC offset. As the person who authored the commit doesn&amp;rsquo;t necessarily have to be the person who committed it to the repository, there&amp;rsquo;s also a line for the committer. Following this is the commit message, which is free text input entered at the time of the commit.&lt;/p>
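&lt;p>The same fields can be cross-checked with git&amp;rsquo;s own plumbing instead of unpacking the object by hand. The sketch below assumes &lt;code>git&lt;/code> is on the PATH and works in a throwaway repository; the identity, file name and commit message are placeholders.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh"># Throwaway repository with a single commit containing one empty file
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
touch file_x
git add file_x
git commit -q -m "First Commit"

# Object type and payload length: the two values in the object header
obj_type=$(git cat-file -t HEAD)
obj_size=$(git cat-file -s HEAD)
echo "$obj_type $obj_size"

# Pretty-print the payload: tree, author, committer, blank line, message
git cat-file -p HEAD

# The root tree that the commit refers to
root_tree=$(git rev-parse "HEAD^{tree}")
echo "root tree: $root_tree"
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The first line of the pretty-printed commit should name the same tree that &lt;code>HEAD^{tree}&lt;/code> resolves to.&lt;/p>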
&lt;p>Let&amp;rsquo;s place this on our graph:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/git-under-the-hood/index_files/figure-html/unnamed-chunk-19-1.png" width="672" />
The commit points to the tree object representing the root directory. But this first commit is a special commit, missing a piece of information: parents. Every other commit in this repository from this point onwards will point to one or more parent entries. This creates a &lt;em>directed acyclic graph&lt;/em> (DAG), allowing any commit to be traced back to the first commit. As every commit&amp;rsquo;s hash is dependent on its parents&amp;rsquo; hash, the graph is also a form of &lt;a href="https://en.wikipedia.org/">Merkle tree&lt;/a>.&lt;/p>
&lt;p>If we change the contents of &lt;em>file_x&lt;/em>, stage it, and create a second commit, we can see this additional information in the commit object:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Change the contents, add, and create a second commit&lt;/span>
echo &lt;span style="color:#e6db74">&amp;#34;Root Changed&amp;#34;&lt;/span> &amp;gt; file_x
git add file_x
git commit -q -m &lt;span style="color:#e6db74">&amp;#34;Second Commit&amp;#34;&lt;/span>
&lt;span style="color:#75715e"># Determine the path to the second commit object&lt;/span>
COMMIT_DIR&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#66d9ef">$(&lt;/span>cut -c 1-2 .git/refs/heads/master&lt;span style="color:#66d9ef">)&lt;/span>
COMMIT_OBJ&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#66d9ef">$(&lt;/span>cut -c 3- .git/refs/heads/master&lt;span style="color:#66d9ef">)&lt;/span>
&lt;span style="color:#75715e"># Unpack the contents of the commit&lt;/span>
pigz -cd .git/objects/$COMMIT_DIR/$COMMIT_OBJ |
perl -0777 -nE &lt;span style="color:#e6db74">&amp;#39;print join &amp;#34;\n&amp;#34;, unpack(&amp;#34;Z*A*&amp;#34;)&amp;#39;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>commit 224
tree 6e09d0dbb13d342d66580c40a49dd1583958ccc8
parent 3658bfd8a7cda8ee50181497ab8ec4e699428877
author Greg Foletta &amp;lt;greg@foletta.org&amp;gt; 1654027282 +1000
committer Greg Foletta &amp;lt;greg@foletta.org&amp;gt; 1654027282 +1000
Second Commit
&lt;/code>&lt;/pre>&lt;p>We can then place the second commit onto our graph, omitting the link back to the parent (we&amp;rsquo;ll get to that in the next section):&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/git-under-the-hood/index_files/figure-html/unnamed-chunk-21-1.png" width="672" />
What we see here represents the core of how git stores data. The commits represent &amp;lsquo;snapshots&amp;rsquo; of the state of the files and directories in the repository. In this particular scenario the two commits point to different root tree objects, each of which points to a different blob representing &lt;em>file_x&lt;/em>. But they share the same tree object representing the subdirectory and the same blob representing &lt;em>file_y&lt;/em>.&lt;/p>
&lt;p>There are no &amp;lsquo;diffs&amp;rsquo; calculated between commits; if one byte changes in a file, a new blob is created, resulting in a new tree (or trees), resulting in a new commit. In terms of the &lt;a href="https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff">space-time trade-off&lt;/a>, git spends space to save time, resulting in a simple model of how data changes over time.&lt;/p>
&lt;p>The problem is, our commits are still addressed via a 160-bit hash. This is all well and good for a computer, but we&amp;rsquo;d like something a bit more human friendly. This is the role of branches.&lt;/p>
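&lt;p>The snapshot model can be checked by comparing blob hashes across two commits. This is a sketch in a throwaway repository, assuming &lt;code>git&lt;/code> is on the PATH; the file contents and identity are invented for the example.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# First snapshot: two files
echo "x, version 1" &amp;gt; file_x
echo "y, never changes" &amp;gt; file_y
git add file_x file_y
git commit -q -m "First Commit"

# Second snapshot: only file_x changes
echo "x, version 2" &amp;gt; file_x
git commit -q -am "Second Commit"

# 'commit:path' resolves to the blob that the commit's tree holds for that path
y_old=$(git rev-parse HEAD~1:file_y)
y_new=$(git rev-parse HEAD:file_y)
x_old=$(git rev-parse HEAD~1:file_x)
x_new=$(git rev-parse HEAD:file_x)
echo "file_y: $y_old $y_new"
echo "file_x: $x_old $x_new"
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The two &lt;em>file_y&lt;/em> hashes match because both snapshots reuse one blob, while &lt;em>file_x&lt;/em> gets a complete new blob rather than a diff.&lt;/p>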
&lt;h1 id="branches-and-head">Branches and HEAD&lt;/h1>
&lt;p>Branches are relatively simple: they are a human-friendly, named pointer to a commit hash. Local branches (as opposed to remote branches, which we won&amp;rsquo;t cover in this article) are stored in &lt;em>.git/refs/heads/&lt;/em>, and right now the current master&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup> branch points to the hash of our second commit:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">cat .git/refs/heads/master
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>89ec2b06b21f25cdbd763924c751c8b24886d5c2
&lt;/code>&lt;/pre>&lt;p>We also need to briefly mention &lt;em>HEAD&lt;/em>. This file tracks which commit is currently &amp;lsquo;active&amp;rsquo;, i.e. the checked out files match those in the commit. We see &lt;em>HEAD&lt;/em> currently refers to our master branch&lt;sup id="fnref:2">&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref">2&lt;/a>&lt;/sup>:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">cat .git/HEAD
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>ref: refs/heads/master
&lt;/code>&lt;/pre>&lt;p>If we create a new branch, git follows what HEAD points to until it reaches a commit, and the new branch points at that commit:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Create a new branch&lt;/span>
git branch branch_2
&lt;span style="color:#75715e"># List the branches and the hashes they point to&lt;/span>
find .git/refs/heads/* -type f -exec sh -c &lt;span style="color:#e6db74">&amp;#39;echo -n &amp;#34;{} -&amp;gt; &amp;#34; &amp;amp;&amp;amp; cat {}&amp;#39;&lt;/span> &lt;span style="color:#ae81ff">\;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>.git/refs/heads/branch_2 -&amp;gt; 89ec2b06b21f25cdbd763924c751c8b24886d5c2
.git/refs/heads/master -&amp;gt; 89ec2b06b21f25cdbd763924c751c8b24886d5c2
&lt;/code>&lt;/pre>&lt;p>When a new commit is created, the current branch is moved to point to the new commit (and HEAD indirectly points to the commit through this branch):&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Checkout a branch and commit on it&lt;/span>
git checkout -q branch_2
echo $RANDOM &amp;gt; file_x
git commit -q -am &lt;span style="color:#e6db74">&amp;#34;Third Commit (branch_2)&amp;#34;&lt;/span>
&lt;span style="color:#75715e"># The &amp;#39;new_branch&amp;#39; branch now points to a different commit.&lt;/span>
find .git/refs/heads/* -type f -exec sh -c &lt;span style="color:#e6db74">&amp;#39;echo -n &amp;#34;{} -&amp;gt; &amp;#34; &amp;amp;&amp;amp; cat {}&amp;#39;&lt;/span> &lt;span style="color:#ae81ff">\;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>.git/refs/heads/branch_2 -&amp;gt; 54d46328e009399d656f158c04df8ad9c2b24cf6
.git/refs/heads/master -&amp;gt; 89ec2b06b21f25cdbd763924c751c8b24886d5c2
&lt;/code>&lt;/pre>&lt;p>If we check out the master branch and create a new commit, we can visualise how the two branches have diverged:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Commit back on the master branch&lt;/span>
git checkout -q master
echo $RANDOM &amp;gt; file_x
git commit -q -am &lt;span style="color:#e6db74">&amp;#34;Fourth Commit (master)&amp;#34;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/git-under-the-hood/index_files/figure-html/unnamed-chunk-27-1.png" width="672" />&lt;/p>
&lt;p>The third and fourth commits are both descendants of the second commit. Adding our branches to the graph, we see they point to the tips of this graph:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/git-under-the-hood/index_files/figure-html/unnamed-chunk-28-1.png" width="672" />
There&amp;rsquo;s nothing stopping us from creating more branches, even if they point to the same location:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Create new branches&lt;/span>
git branch branch_3
git branch branch_4
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/git-under-the-hood/index_files/figure-html/unnamed-chunk-30-1.png" width="672" />&lt;/p>
&lt;p>The key takeaway is that branches provide a human-friendly way of navigating git&amp;rsquo;s commit DAG.&lt;/p>
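&lt;p>Reading the files under &lt;em>.git/&lt;/em> works here, but refs can also be stored packed in &lt;em>.git/packed-refs&lt;/em>, so the plumbing commands are a more general way to ask the same questions. A small sketch, assuming &lt;code>git&lt;/code> is on the PATH and using a throwaway repository with a placeholder identity:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
git commit -q --allow-empty -m "First Commit"
git branch branch_2

# HEAD is a symbolic ref that names the current branch
head_ref=$(git symbolic-ref HEAD)
echo "HEAD refers to: $head_ref"

# A branch name resolves to a plain commit hash
head_commit=$(git rev-parse HEAD)
branch_commit=$(git rev-parse branch_2)
echo "HEAD:     $head_commit"
echo "branch_2: $branch_commit"
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The name of the initial branch depends on the local git configuration, so the first line may show something other than &lt;em>refs/heads/master&lt;/em>.&lt;/p>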
&lt;h1 id="index">Index&lt;/h1>
&lt;p>The final file we&amp;rsquo;re going to look at is the index. While it has a few purposes, we&amp;rsquo;ll focus on its main role as the &amp;lsquo;staging area&amp;rsquo;. The exact structure is &lt;a href="https://git-scm.com/docs/index-format">available here&lt;/a>, and I&amp;rsquo;ve converted this into Perl &lt;em>unpack()&lt;/em> language:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">perl -0777 -nE &lt;span style="color:#e6db74">&amp;#39;
&lt;/span>&lt;span style="color:#e6db74"># Extract out each file in the index
&lt;/span>&lt;span style="color:#e6db74">my @index = unpack(&amp;#34;A4 H8 N (N4 N2 n B16 N N N H40 B16 Z*)&amp;#34;);
&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">say &amp;#34;Index Header: &amp;#34; . join &amp;#34; &amp;#34;, @index[0..2];
&lt;/span>&lt;span style="color:#e6db74">say &amp;#34;lstat() info: &amp;#34; . join &amp;#34; &amp;#34;, @index[3..12];
&lt;/span>&lt;span style="color:#e6db74">say &amp;#34;Object &amp;amp; Filepath: &amp;#34; . join &amp;#34; &amp;#34;, @index[13..16];
&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span> .git/index
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>Index Header: DIRC 00000002 3
lstat() info: 1654027283 118849971 1654027283 118849971 64769 7734889 0 1000000110100100 1000 1000
Object &amp;amp; Filepath: 6 511c5ae2b662376b23658fe922231d824d4e03e6 0000000000000110 file_x
&lt;/code>&lt;/pre>&lt;p>The first line shows the four-byte &amp;lsquo;DIRC&amp;rsquo; signature (which stands for &amp;lsquo;directory cache&amp;rsquo;), the version number, and the number of entries (files in the index). We&amp;rsquo;ve only unpacked one of the files.&lt;/p>
&lt;p>The first fields contain information from the &lt;code>lstat(2)&lt;/code> function: last changed and modified time, the device and inode, permissions, uid and gid, and file size. These values allow git to quickly determine if files in the working tree have been modified.&lt;/p>
&lt;p>Then comes the hash of the object, a flags field (which includes the length of path), and the path to the object.&lt;/p>
&lt;p>Recall from the &lt;em>blobs&lt;/em> section that the index was created when we added a file to the staging area via &lt;code>git add&lt;/code>. Let&amp;rsquo;s modify &lt;em>file_x&lt;/em> and add it to the staging area:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">echo &lt;span style="color:#e6db74">&amp;#34;Index Modification&amp;#34;&lt;/span> &amp;gt; file_x
git add file_x
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now we&amp;rsquo;ll take another look at the index:&lt;/p>
&lt;pre>&lt;code>ctime, mtime: 1654027286 1654027286
object, filepath: db12d29ef25db0f954787c6d620f1f6e9ce3c778 file_x
&lt;/code>&lt;/pre>&lt;p>The change and modify times have been updated, and so has the object that &lt;em>file_x&lt;/em> points to. If a &lt;code>git commit&lt;/code> is issued, the tree underlying the commit is based upon the current state of this index. When a different branch is checked out, the index is rebuilt so that the files in the index point to the correct &lt;em>blobs&lt;/em> for that particular commit.&lt;/p>
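&lt;p>For everyday use there is no need to unpack the index by hand, as &lt;code>git ls-files --stage&lt;/code> lists each entry&amp;rsquo;s mode, blob hash, stage number and path. A sketch in a throwaway repository, assuming &lt;code>git&lt;/code> is on the PATH:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">repo=$(mktemp -d)
cd "$repo"
git init -q

# Stage a file: this writes the blob and records it in the index
echo "Index Modification" &amp;gt; file_x
git add file_x

# One line per index entry: mode, blob hash, stage number, path
git ls-files --stage

# ':path' names the blob the index currently holds for that path
staged=$(git rev-parse :file_x)
# Hash of the working tree file, computed without storing anything
on_disk=$(git hash-object file_x)
echo "staged:  $staged"
echo "on disk: $on_disk"
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The two hashes agree until &lt;em>file_x&lt;/em> is edited again without being re-added, at which point the index still points at the older blob.&lt;/p>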
&lt;h1 id="conclusion">Conclusion&lt;/h1>
&lt;p>In this article we dived into the internals of git. We&amp;rsquo;ve looked at git&amp;rsquo;s data model and learned about the raw blobs of data, the trees that hold the filesystem information, and the commits that point to the root of the tree. We&amp;rsquo;ve seen how commits have parents, creating a directed acyclic graph of differing states of the repository.&lt;/p>
&lt;p>Branches were shown to be quite simple, human-readable pointers to commit hashes, allowing us to navigate around git&amp;rsquo;s commit graph. Finally we cracked open the index, which is a staging area for the next commit.&lt;/p>
&lt;p>It&amp;rsquo;s perhaps a sad indictment when you have to understand the internals of something in order to use it. This could be bad design, the inherent complexity of the problem domain, or perhaps a little bit of both. Nevertheless git is a widely used tool that&amp;rsquo;s incredibly powerful, and investing some time to understand it will surely pay dividends down the line when you want to manage configuration, code, or almost any other type of text-based content.&lt;/p>
&lt;section class="footnotes" role="doc-endnotes">
&lt;hr>
&lt;ol>
&lt;li id="fn:1" role="doc-endnote">
&lt;p>The default branch, which is now configurable in git version 2.28 and above &lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:2" role="doc-endnote">
&lt;p>HEAD doesn&amp;rsquo;t have to refer to a branch; it can refer to an arbitrary commit. This is known as a &amp;lsquo;detached HEAD&amp;rsquo;. &lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;/ol>
&lt;/section></description></item><item><title>A Noisy Wind Tunnel</title><link>https://clt.blog.foletta.net/post/noisy-wind-tunnel/</link><pubDate>Sat, 29 Jan 2022 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/noisy-wind-tunnel/</guid><description>&lt;p>I raced bikes as a junior and came back to it after a twenty year hiatus. One of the biggest contrasts I&amp;rsquo;ve seen in the sport is the proliferation of bike sensors. All I used to have &amp;lsquo;back in the day&amp;rsquo; was a simple computer with velocity and cadence. Now I&amp;rsquo;ve got that, plus position via GPS, power, heart rate, pedal smoothness and balance, all collected and displayed on my phone.&lt;/p>
&lt;p>It&amp;rsquo;s all well and good to collect and visualise this data, but surely there was more I could do with it? After much thought, I realised I could use it to determine the aerodynamic efficiency of different positions on the bike. So in this article I&amp;rsquo;m going to attempt to answer the following question:&lt;/p>
&lt;blockquote>
&lt;p>How much more aerodynamically efficient is it to ride with your hands on the drops of the handlebars, rather than the tops?&lt;/p>
&lt;/blockquote>
&lt;p>There are two main sections of the article: in the first section we look at how the data was generated, loaded into R, and transformed into a state that&amp;rsquo;s ready for analysis. It&amp;rsquo;s in this section where we see R really shine, with a simple and elegant method of transforming XML into a rectangular, tidy data format.&lt;/p>
&lt;p>In the second section we define an aerodynamic model (or more accurately reuse a common model), then perform a simple regression of this data to determine the aerodynamic properties. Diagnostics on the model highlight a key missing element, forcing us to update the model to get better estimates of the aerodynamic properties.&lt;/p>
&lt;h1 id="data-acquisition">Data Acquisition&lt;/h1>
&lt;p>We&amp;rsquo;ll first look at how the experiment was set up and how the data was captured. A track bike (which has a single, fixed gear) was ridden around the &lt;a href="https://www.google.com/maps/@-37.7297305,144.9553304,147m/data=!3m1!1e3">Coburg velodrome&lt;/a>, which is a 250m outdoor track. A sensor (Wahoo speed) on the hub of the wheel collected the velocity, and the pedals (PowerTap P1s) collected power and cadence.&lt;/p>
&lt;p>Data was gathered while in two different positions on the bike&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>. The first position which we will call being on the &amp;lsquo;tops&amp;rsquo; looked similar to this (sans brake levers):&lt;/p>
&lt;p>&lt;img src="tops.jpg" style="width:40%;height:40%;" style="display: block; margin: auto;" />&lt;/p>
&lt;p>The second position which we will call being on the &amp;lsquo;drops&amp;rsquo; looked like this:&lt;/p>
&lt;p>&lt;img src="drops.jpg" style="width:40%;height:40%;" style="display: block; margin: auto;" />&lt;/p>
&lt;p>For each position the pace was slowly increased from 10km/h to 45km/h in approximately 10km/h increments. For each increment level, the pace was held as close as possible to constant for two laps, increasing to three laps for higher velocities to try and get enough samples.&lt;/p>
&lt;p>The experimental environment is far from clean, with two main external elements affecting our data generation process: wind, and the lumpiness of the velodrome. Because we are moving around an oval, both of these external elements will add noise to the data, but shouldn&amp;rsquo;t bias it in any one direction. If there was any biasing effect it would be from wind gusts.&lt;/p>
&lt;p>What results are we expecting? We&amp;rsquo;re expecting better aerodynamics when in the drops position due to two factors: a reduction in the front on surface area, and a more streamlined shape.&lt;/p>
&lt;h1 id="transforming-the-data">Transforming the Data&lt;/h1>
&lt;p>The data is downloaded in TCX (Training Center XML) format. While it&amp;rsquo;s good for us that it&amp;rsquo;s in a standard structured format, it&amp;rsquo;s not quite the rectangular, tidy data that we need for our analysis. The first step is therefore to extract and transform it. The XML is made up of a single &lt;em>activity&lt;/em> with multiple &lt;em>laps&lt;/em>. Each &lt;em>lap&lt;/em> has &lt;em>trackpoints&lt;/em> which contain a timestamp and the data collected (velocity, power, heart rate, etc). A trackpoint is taken every second.&lt;/p>
&lt;p>You can look at the full file &lt;a href="cycle_data.tcx">here&lt;/a>, but below is a high level overview of the structure:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-xml" data-lang="xml">&lt;span style="color:#f92672">&amp;lt;TrainingCenterDatabase&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Activities&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Activity&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Lap&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Track&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Trackpoint&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Time&amp;gt;&lt;/span>2022-01-16T00:00:41Z&lt;span style="color:#f92672">&amp;lt;/Time&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;DistanceMeters&amp;gt;&lt;/span>1.48&lt;span style="color:#f92672">&amp;lt;/DistanceMeters&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;HeartRateBpm&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Value&amp;gt;&lt;/span>105&lt;span style="color:#f92672">&amp;lt;/Value&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/HearthRateBpm&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Cadence&amp;gt;&lt;/span>32&lt;span style="color:#f92672">&amp;lt;/Cadence&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Extensions&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;TPX&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Speed&amp;gt;&lt;/span>3.19&lt;span style="color:#f92672">&amp;lt;/Speed&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;Watts&amp;gt;&lt;/span>56&lt;span style="color:#f92672">&amp;lt;/Watts&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/TPX&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/Extensions&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/Trackpoint&amp;gt;&lt;/span>
&lt;span style="color:#75715e">&amp;lt;!-- Multiple trackpoints (1 second per sample) --&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/Track&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/Lap&amp;gt;&lt;/span>
&lt;span style="color:#75715e">&amp;lt;!-- Multiple laps (generated manually) --&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/Activity&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/Activities&amp;gt;&lt;/span>
&lt;span style="color:#f92672">&amp;lt;/TrainingCenterDatabase&amp;gt;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Thanks to the xml2 package, XPath queries, and the vectorised nature of R, extracting and transforming this data is relatively easy:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">cycle_data &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">read_xml&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;cycle_data.tcx&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">xml_ns_strip&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">xml_find_all&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;.//Trackpoint[Extensions]&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
{
&lt;span style="color:#a6e22e">tibble&lt;/span>(
time &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">xml_find_first&lt;/span>(., &lt;span style="color:#e6db74">&amp;#39;./Time&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">xml_text&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">ymd_hms&lt;/span>(),
velocity &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">xml_find_first&lt;/span>(., &lt;span style="color:#e6db74">&amp;#39;./Extensions/TPX/Speed&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">xml_double&lt;/span>(),
power &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">xml_find_first&lt;/span>(., &lt;span style="color:#e6db74">&amp;#39;./Extensions/TPX/Watts&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">xml_integer&lt;/span>(),
bpm &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">xml_find_first&lt;/span>(., &lt;span style="color:#e6db74">&amp;#39;./HeartRateBpm/Value&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">xml_integer&lt;/span>(),
cadence &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">xml_find_first&lt;/span>(., &lt;span style="color:#e6db74">&amp;#39;./Cadence&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">xml_integer&lt;/span>(),
lap &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">xml_find_num&lt;/span>(
.,
&lt;span style="color:#e6db74">&amp;#39;count(./parent::Track/parent::Lap/preceding-sibling::Lap)&amp;#39;&lt;/span>
),
)
}
&lt;/code>&lt;/pre>&lt;/div>&lt;table>
&lt;thead>
&lt;tr>
&lt;th align="left">time&lt;/th>
&lt;th align="right">velocity&lt;/th>
&lt;th align="right">power&lt;/th>
&lt;th align="right">bpm&lt;/th>
&lt;th align="right">cadence&lt;/th>
&lt;th align="right">lap&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:42&lt;/td>
&lt;td align="right">3.19&lt;/td>
&lt;td align="right">56&lt;/td>
&lt;td align="right">105&lt;/td>
&lt;td align="right">32&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:43&lt;/td>
&lt;td align="right">3.28&lt;/td>
&lt;td align="right">100&lt;/td>
&lt;td align="right">104&lt;/td>
&lt;td align="right">34&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:44&lt;/td>
&lt;td align="right">3.50&lt;/td>
&lt;td align="right">75&lt;/td>
&lt;td align="right">104&lt;/td>
&lt;td align="right">36&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:45&lt;/td>
&lt;td align="right">3.58&lt;/td>
&lt;td align="right">84&lt;/td>
&lt;td align="right">105&lt;/td>
&lt;td align="right">38&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:46&lt;/td>
&lt;td align="right">3.78&lt;/td>
&lt;td align="right">79&lt;/td>
&lt;td align="right">106&lt;/td>
&lt;td align="right">40&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:47&lt;/td>
&lt;td align="right">4.08&lt;/td>
&lt;td align="right">83&lt;/td>
&lt;td align="right">107&lt;/td>
&lt;td align="right">43&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:48&lt;/td>
&lt;td align="right">4.39&lt;/td>
&lt;td align="right">172&lt;/td>
&lt;td align="right">108&lt;/td>
&lt;td align="right">46&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:49&lt;/td>
&lt;td align="right">4.58&lt;/td>
&lt;td align="right">197&lt;/td>
&lt;td align="right">109&lt;/td>
&lt;td align="right">47&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:50&lt;/td>
&lt;td align="right">4.78&lt;/td>
&lt;td align="right">213&lt;/td>
&lt;td align="right">111&lt;/td>
&lt;td align="right">49&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">2022-01-16 00:00:51&lt;/td>
&lt;td align="right">5.00&lt;/td>
&lt;td align="right">288&lt;/td>
&lt;td align="right">113&lt;/td>
&lt;td align="right">51&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>While terseness is elegant it can also make the code difficult to interpret, so I think it&amp;rsquo;s valuable to go through each step of the pipeline:&lt;/p>
&lt;ol>
&lt;li>The TCX file is read in as an &lt;em>xml_document&lt;/em>.&lt;/li>
&lt;li>The XML is namespaced, but as we&amp;rsquo;re only working with this file we strip the namespace to make our XPath easier to work with.&lt;/li>
&lt;li>Using the &lt;code>.//Trackpoint[Extensions]&lt;/code> XPath we find all &amp;lsquo;trackpoint&amp;rsquo; nodes that have a child &amp;lsquo;extensions&amp;rsquo; node.
&lt;ul>
&lt;li>We do this because some of the trackpoints only have a timestamp with no data.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>We then construct a data frame (a tibble) by finding and extracting the velocity, power, etc from each trackpoint, with the XPaths being relative to the trackpoint node.
&lt;ul>
&lt;li>The braces stop the normal behaviour of the left-hand side of the pipe being passed as the first argument to the tibble.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Determining which &amp;lsquo;lap&amp;rsquo; a trackpoint belongs to takes a little more work. We do this by finding its grandparent lap node and counting how many preceding lap siblings that node has. The first lap will have 0 preceding siblings, the second lap 1, and so on.&lt;/li>
&lt;/ol>
&lt;p>That&amp;rsquo;s it! With fewer than 20 lines of R the XML has been transformed into a tidy, rectangular data format, ready for visualisation and analysis. Speaking of visualisation, let&amp;rsquo;s take a look at a few different aspects of the data to get a general feel for it. The following graph shows the power output over time, each lap being coloured separately. The laps numbered 1 and 3 (the second and fourth, as the count starts at zero) contain the data that will be used in the model.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-6-1.png" width="672" />&lt;/p>
&lt;p>The data was generated on a track bike which has only a single gear, so the velocity and cadence should have a near perfect linear relationship:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-7-1.png" width="672" />&lt;/p>
&lt;p>There&amp;rsquo;s a clear linear relationship, but there is also a distribution of velocities across each cadence value. This is likely due to the difference in precision between the cadence and the velocity, as cadence is measured as an integer whereas velocity is a double with a single decimal point&lt;sup id="fnref:2">&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref">2&lt;/a>&lt;/sup>.&lt;/p>
&lt;p>Before we look at the power and velocity, we need to do a little bit of housework. The second and fourth laps that contain our experimental data are extracted, and a new &lt;em>position&lt;/em> factor variable is created with appropriately named levels.&lt;/p>
&lt;p>In what could be considered controversial, we&amp;rsquo;re going to remove data points where the bike was accelerating - i.e. the rate of change of the power between trackpoint samples was between -10 and 10 watts. Acceleration was required to &amp;lsquo;move&amp;rsquo; to different velocity increments, but our model only relates to points of (relatively) constant velocity. Given our knowledge of the data generation process, I think this data removal can be justified.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">cycle_data_cleaned &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
cycle_data &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">filter&lt;/span>(
lap &lt;span style="color:#f92672">%in%&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>,&lt;span style="color:#ae81ff">3&lt;/span>),
&lt;span style="color:#a6e22e">between&lt;/span>(power &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#a6e22e">lag&lt;/span>(power), &lt;span style="color:#ae81ff">-10&lt;/span>, &lt;span style="color:#ae81ff">10&lt;/span>)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(position &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">fct_recode&lt;/span>(&lt;span style="color:#a6e22e">as_factor&lt;/span>(lap), &lt;span style="color:#e6db74">&amp;#34;Tops&amp;#34;&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;1&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;Drops&amp;#34;&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;3&amp;#34;&lt;/span>))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>We can now view the power output versus the velocity by position of the data we&amp;rsquo;ll be using in our model.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-9-1.png" width="672" />&lt;/p>
&lt;p>We see a non-linear relationship, with power rising ever more steeply as velocity increases, and can see the &amp;ldquo;blobs&amp;rdquo; of data where I have tried to keep a constant velocity. What is not instantly visible is the difference in power output versus velocity for each of the different hand positions.&lt;/p>
&lt;h1 id="defining-and-building-a-model">Defining and Building a Model&lt;/h1>
&lt;p>We&amp;rsquo;ll be using the classic drag equation as our model:&lt;/p>
&lt;p>$$ F_d = \frac{1}{2}\rho C_D A v^2$$
This says that the force of drag \(F_d\) on the bike/body system when moving through the air is equal to half of the density of the fluid (\(\rho\)) times the drag coefficient of the bike/body (\(C_D\)) times the frontal cross-sectional area (\(A\)) times the square of the velocity (\(v\)). I&amp;rsquo;m going to bundle up all of these coefficients into a single coefficient \(\beta\).&lt;/p>
&lt;p>$$ \text{Let } \beta = \frac{1}{2} \rho C_D A $$
$$ F_d = \beta v^2 $$
We&amp;rsquo;ve got force on our left-hand side, but we need power. Energy is force times distance, and power is energy over time, so we have:&lt;/p>
&lt;p>$$ F_d \Big( \frac{x}{t} \Big) = \beta v^2 \Big( \frac{x}{t} \Big)$$
Distance over time is velocity so we are left with:&lt;/p>
&lt;p>$$ P_d = \beta v^3 $$
The coefficient is conditional on the position variable, so we&amp;rsquo;ll end up with two coefficients from this model:&lt;/p>
&lt;p>$$ P_d = \Bigg\{\begin{array}{ll}
\beta_{tops} v^3 &amp;amp; \text{if}\ position = tops \\&lt;br>
\beta_{drops} v^3 &amp;amp; \text{if}\ position = drops
\end{array} $$&lt;/p>
&lt;p>Is this a perfect model? Not at all, but for our purposes it should be reasonable. Don&amp;rsquo;t make me tap the &amp;ldquo;all models are wrong&amp;hellip;&amp;rdquo; sign!&lt;/p>
&lt;p>The model will give us an estimate (with some uncertainty) of the \(\beta_{tops}\) value for when I was on the tops of the handlebars, and of the \(\beta_{drops}\) value for when I was in the drops.&lt;/p>
&lt;p>We have some prior information that can be included in the model: it takes zero watts to go zero metres per second. This implies that our model should go through the origin \((0,0)\) and we should not include an intercept. I believe that given our strong knowledge of the process that generated the data, removing the intercept is valid.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">cycle_data_mdl &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
cycle_data_cleaned &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">lm&lt;/span>(power &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#f92672">+&lt;/span> position&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#a6e22e">I&lt;/span>(velocity^3), data &lt;span style="color:#f92672">=&lt;/span> .)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here&amp;rsquo;s what the model looks like overlaid on the data:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-11-1.png" width="672" />&lt;/p>
&lt;p>As expected, the drops position is more efficient than the tops. Before looking at the parameters of the model let&amp;rsquo;s first look at some diagnostics. The first one to look at is the residuals plotted against the fitted values:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-12-1.png" width="672" />&lt;/p>
&lt;p>I&amp;rsquo;ve added a linear regression line to highlight the trend, and it shows something quite interesting: there appears to be a linear relationship that our model hasn&amp;rsquo;t accounted for.&lt;/p>
&lt;p>If we think back to our model, we were only accounting for the power required to overcome drag, but there&amp;rsquo;s another force in play that we&amp;rsquo;ve completely ignored: friction. There&amp;rsquo;s the rolling friction of the wheels on the track, and the sliding friction of the hubs, the chainset and pedals, and of the chain on the sprocket.&lt;/p>
&lt;p>With this realisation, let&amp;rsquo;s try and build a better model to account for this force.&lt;/p>
&lt;h1 id="building-a-better-model">Building a Better Model&lt;/h1>
&lt;p>In the original model, \(P_{Total} = P_{Drag}\), but in our updated model total power used is made up of power to overcome drag plus power to overcome friction:&lt;/p>
&lt;p>$$ P_{t} = P_{d} + P_{f} $$
Once again knowing that force times distance is energy, and energy over time is power, we end up with:&lt;/p>
&lt;p>$$ P_{f} = \frac{ F_{f} \times x }{ t } = F_{f}v $$&lt;/p>
&lt;p>If we let \(\beta_1 = F_{f}\) then our updated model is:&lt;/p>
&lt;p>$$ P_{t} = \beta_1 v + \Bigg\{\begin{array}{ll}
\beta_{tops} v^3 &amp;amp; \text{if}\ position = tops \\&lt;br>
\beta_{drops} v^3 &amp;amp; \text{if}\ position = drops
\end{array} $$&lt;/p>
&lt;p>We now run our updated model over the data. The frictional component is not going to be affected by the position on the handlebars, so we ensure it&amp;rsquo;s not conditional on the position:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">cycle_data_mdl &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
cycle_data_cleaned &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">lm&lt;/span>(power &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#f92672">+&lt;/span> velocity&lt;span style="color:#f92672">+&lt;/span> position&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#a6e22e">I&lt;/span>(velocity^3), data &lt;span style="color:#f92672">=&lt;/span> .)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here&amp;rsquo;s the updated model on top of the original data:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-14-1.png" width="672" />&lt;/p>
&lt;p>It&amp;rsquo;s hard to discern much difference from this graph, so we return to the fitted versus residual diagnostic graph:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-15-1.png" width="672" />&lt;/p>
&lt;p>That&amp;rsquo;s looking much better! We&amp;rsquo;ve now captured the linear component, the residuals are random, and the variation is reasonably even across the entire spread of fitted values. There are a few outliers, and a more rigorous analysis would look to determine whether they had significant leverage on our regression line. Subjectively looking at this graph though my guess would be no.&lt;/p>
&lt;p>The other type of diagnostic to look at is a histogram of the residuals. A linear regression has an assumption that the residuals are normal. The residual shape doesn&amp;rsquo;t affect the point estimates of the model, but does affect the confidence intervals of the parameters.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-16-1.png" width="672" />&lt;/p>
&lt;p>This looks great: the residuals are approximately normal, there&amp;rsquo;s not much mass outside of 2 standard deviations, and the mean sits approximately at zero.&lt;/p>
&lt;p>With confidence in the model we now take a look at the parameters:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th align="left">Term&lt;/th>
&lt;th align="right">Estimate&lt;/th>
&lt;th align="right">Std Error&lt;/th>
&lt;th align="right">Statistic&lt;/th>
&lt;th align="right">P Value&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td align="left">velocity&lt;/td>
&lt;td align="right">4.1788613&lt;/td>
&lt;td align="right">0.3782129&lt;/td>
&lt;td align="right">11.04897&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">positionTops:I(velocity^3)&lt;/td>
&lt;td align="right">0.2131439&lt;/td>
&lt;td align="right">0.0045782&lt;/td>
&lt;td align="right">46.55634&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="left">positionDrops:I(velocity^3)&lt;/td>
&lt;td align="right">0.1889915&lt;/td>
&lt;td align="right">0.0044721&lt;/td>
&lt;td align="right">42.26044&lt;/td>
&lt;td align="right">0&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The velocity term is the \(\beta_1\) coefficient, which is the frictional force of the bike. The model has determined that the friction of the bike accounts for 4.18 newtons of force.&lt;/p>
&lt;p>The next two values are the \(\beta_{tops}\) and \(\beta_{drops}\) coefficients. We&amp;rsquo;re not concerned with the specific values (being a combination of the fluid density, drag coefficient, and my cross-sectional area), but what we want to look at is the percentage change between these values. The result is that, in the drops, we need 11.33% less power to overcome drag at a given velocity than on the tops. Put another way, we are 11.33% more aerodynamically efficient when positioned in the drops rather than on the tops.&lt;/p>
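&lt;p>As a quick check, that percentage can be reproduced by hand from the two estimates in the table above, using the same ratio as the sampling code later in the post:&lt;/p>
&lt;p>$$ \frac{\beta_{tops} - \beta_{drops}}{\beta_{tops}} \times 100 = \frac{0.2131439 - 0.1889915}{0.2131439} \times 100 \approx 11.33\% $$
Note that this ratio only concerns the \(v^3\) drag term; the frictional term \(\beta_1 v\) is the same in both positions, so the saving as a share of &lt;em>total&lt;/em> power is somewhat smaller.&lt;/p>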
&lt;p>The following table gives you an idea of the differences in power (in watts) required for velocities of 20, 40, and 60 km/h.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th align="right">Velocity (km/h)&lt;/th>
&lt;th align="right">Tops (W)&lt;/th>
&lt;th align="right">Drops (W)&lt;/th>
&lt;th align="right">Power Difference (W)&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td align="right">20&lt;/td>
&lt;td align="right">59.76&lt;/td>
&lt;td align="right">55.62&lt;/td>
&lt;td align="right">4.14&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="right">40&lt;/td>
&lt;td align="right">338.81&lt;/td>
&lt;td align="right">305.68&lt;/td>
&lt;td align="right">33.13&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td align="right">60&lt;/td>
&lt;td align="right">1056.43&lt;/td>
&lt;td align="right">944.61&lt;/td>
&lt;td align="right">111.82&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
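&lt;p>To see where these figures come from, here is the 40 km/h row worked through by hand. This is a sketch that assumes the model&amp;rsquo;s velocity is in metres per second; that conversion is my assumption, but it does reproduce the table values:&lt;/p>
&lt;p>$$ v = \frac{40}{3.6} \approx 11.111 \text{ m/s}, \qquad v^3 \approx 1371.74 $$
$$ P_{tops} \approx 4.1788613 \times 11.111 + 0.2131439 \times 1371.74 \approx 46.43 + 292.38 = 338.81 \text{ W} $$
$$ P_{drops} \approx 4.1788613 \times 11.111 + 0.1889915 \times 1371.74 \approx 46.43 + 259.25 = 305.68 \text{ W} $$
The frictional contribution (about 46 W here) is identical in both positions, so the 33.13 W difference comes entirely from the drag term.&lt;/p>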
&lt;h1 id="dont-forget-the-uncertainty">Don&amp;rsquo;t Forget the Uncertainty&lt;/h1>
&lt;p>In calculating the &lt;em>average&lt;/em> percent decrease, the uncertainty in the parameters has been thrown away. If we assume two things about the parameters:&lt;/p>
&lt;ol>
&lt;li>The parameter estimates are normally distributed, and&lt;/li>
&lt;li>There is no covariance between the parameters&lt;/li>
&lt;/ol>
&lt;p>then we can take a computational approach to determining the uncertainty of the percentage. Drawing samples from each of the parameter distributions (with a mean of the parameter estimate and a standard deviation of the standard error), we can calculate the percentage for each pair of samples&lt;sup id="fnref:3">&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref">3&lt;/a>&lt;/sup>, giving us a distribution of percentages. The quantiles we want to determine can then be calculated from this data.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Extract the parameter and standard error from the model.&lt;/span>
beta_tops &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">tidy&lt;/span>(cycle_data_mdl)[[2]][2]
sigma_tops &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">tidy&lt;/span>(cycle_data_mdl)[[3]][2]
beta_drops &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">tidy&lt;/span>(cycle_data_mdl)[[2]][3]
sigma_drops &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">tidy&lt;/span>(cycle_data_mdl)[[3]][3]
&lt;span style="color:#75715e"># Generate our samples and calculate the percentages&lt;/span>
percent_distribution &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">tibble&lt;/span>(
beta_top_dist &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rnorm&lt;/span>(&lt;span style="color:#ae81ff">1000000&lt;/span>, beta_tops, sigma_tops),
beta_drop_dist &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rnorm&lt;/span>(&lt;span style="color:#ae81ff">1000000&lt;/span>, beta_drops, sigma_drops),
percent &lt;span style="color:#f92672">=&lt;/span> ((beta_top_dist &lt;span style="color:#f92672">-&lt;/span> beta_drop_dist) &lt;span style="color:#f92672">/&lt;/span> beta_top_dist) &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#ae81ff">100&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/noisy-wind-tunnel/index_files/figure-html/unnamed-chunk-20-1.png" width="672" />&lt;/p>
&lt;p>Our 89%&lt;sup id="fnref:4">&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref">4&lt;/a>&lt;/sup> confidence interval is therefore [6.69%, 15.77%].&lt;/p>
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>In this article we looked at the aerodynamics of different positions on a bike. We gathered data using different sensors, and showed the elegance of R by transforming XML data into a rectangular, tidy data frame.&lt;/p>
&lt;p>We defined a simple model and used this to perform a regression of power required to maintain a specific velocity. By performing diagnostics on this model, we were able to identify that our model was incomplete, and that we were likely not including friction in the model. We defined and created a new model with friction included, which performed better than our original model.&lt;/p>
&lt;p>The ultimate aim of the article was to determine how much more efficient it is to ride in the &amp;lsquo;drops&amp;rsquo; of the handlebars rather than the &amp;lsquo;tops&amp;rsquo;. From our modelling we found the average estimate of our efficiency gain to be 11.33%, with an 89% confidence interval of [6.69%, 15.77%].&lt;/p>
&lt;section class="footnotes" role="doc-endnotes">
&lt;hr>
&lt;ol>
&lt;li id="fn:1" role="doc-endnote">
&lt;p>Images courtesy of &lt;a href="http://bikegremlin.com">bikegremlin.com&lt;/a> &lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:2" role="doc-endnote">
&lt;p>A linear regression of cadence on velocity was performed and the residuals were in the range of (-.5, .5). This supports our precision difference hypothesis. &lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:3" role="doc-endnote">
&lt;p>Thanks to /u/eatthepieguy for responding to my &lt;a href="https://www.reddit.com/r/statistics/comments/sehzun/q_confidence_intervals_for_percentages/">query on this&lt;/a>. &lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:4" role="doc-endnote">
&lt;p>Why 89%? Well, why 95%? &lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;/ol>
&lt;/section></description></item><item><title>A Tale Of Two Optimisations</title><link>https://clt.blog.foletta.net/post/a-tale-of-two-optimisations/</link><pubDate>Sun, 10 Oct 2021 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/a-tale-of-two-optimisations/</guid><description>&lt;p>A couple of months ago I wrote a toy program called &lt;a href="https://github.com/gregfoletta/whitespacer">whitespacer&lt;/a>. Ever since, I&amp;rsquo;ve had this gnawing feeling that I could have done it better; that it could have been written in a more performant manner. In this article I&amp;rsquo;ll take you through a couple of different ideas I came up with. We&amp;rsquo;ll profile and visualise their different performances and stumble upon some surprising behaviour. We&amp;rsquo;ll use this as an excuse to take a deeper dive into their behaviour at the CPU level and gain a better understanding about the &lt;code>perf&lt;/code> performance analysis tool and branch prediction.&lt;/p>
&lt;h1 id="a-quick-recap">A Quick Recap&lt;/h1>
&lt;p>In &lt;a href="https://clt.blog.foletta.net/post/2021-06-21-whitespacer/">a previous article&lt;/a> I took you through the &lt;em>whitespacer&lt;/em> program. Here&amp;rsquo;s a quick recap of what it does:&lt;/p>
&lt;ul>
&lt;li>Reads from standard in and writes to standard out.&lt;/li>
&lt;li>Works in an &amp;lsquo;encoding&amp;rsquo; and a &amp;lsquo;decoding&amp;rsquo; mode.&lt;/li>
&lt;li>In encoding mode it takes each of the four dibits (2 bits) of each byte and turns it into one of four whitespace characters (tab, new line, carriage return and space).&lt;/li>
&lt;li>Decoding mode does the reverse, taking groups of four whitespace characters and reconstituting the original byte.&lt;/li>
&lt;/ul>
&lt;p>Here&amp;rsquo;s the encoding function:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-C" data-lang="C">&lt;span style="color:#75715e">//Dibit to whitespace lookup table (global variable)
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">char&lt;/span> encode_lookup_tbl[] &lt;span style="color:#f92672">=&lt;/span> { &lt;span style="color:#e6db74">&amp;#39;\t&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;\n&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;\r&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span> };
&lt;span style="color:#75715e">//Given a dibit, returns the whitespace encoding
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#a6e22e">lookup_encode&lt;/span>(&lt;span style="color:#66d9ef">const&lt;/span> &lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> dibit) {
&lt;span style="color:#66d9ef">return&lt;/span> encode_lookup_tbl[ dibit ];
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>I&amp;rsquo;ve omitted the decoding function for brevity, but it&amp;rsquo;s the same with a different lookup table.&lt;/p>
&lt;h1 id="attempt-1-a-mathematical-function">Attempt 1: A Mathematical Function&lt;/h1>
&lt;p>What bothered me about the original implementation was the lookup tables. Even though I knew they&amp;rsquo;d be cached, I still thought the memory accesses might have a detrimental effect on performance.&lt;/p>
&lt;p>I had an idea about using mathematical functions (rather than the lookup table) to perform the encoding/decoding. This would remove the memory accesses and perhaps improve performance.&lt;/p>
&lt;p>From somewhere I recalled that if we have a set of \(k + 1\) data points \((x_0, y_0), \ldots, (x_k, y_k)\) where no two \(x_i\) are the same, we can use a linear regression to fit a polynomial of degree \(k\) that passes exactly through every point. Here&amp;rsquo;s our table of data points:&lt;/p>
&lt;pre>&lt;code># A tibble: 4 × 2
dibit ascii_dec
&amp;lt;int&amp;gt; &amp;lt;int&amp;gt;
1 0 9
2 1 10
3 2 13
4 3 32
&lt;/code>&lt;/pre>&lt;p>This means we can find \(\beta\) coefficients for the function
$$ f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 $$
such that when passed a dibit value it returns the appropriate whitespace character. We can also find a second polynomial \(f^{-1}(x)\), which inverts \(f\) at our four points: it takes a whitespace character and returns a dibit, making it our decoding function. Let&amp;rsquo;s create linear regression models in R where &lt;code>whitespace&lt;/code> is the table holding our dibit and whitespace values:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">encode_model &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
whitespace &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">lm&lt;/span>(ascii_dec &lt;span style="color:#f92672">~&lt;/span> dibit &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">I&lt;/span>(dibit^2) &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">I&lt;/span>(dibit^3), data &lt;span style="color:#f92672">=&lt;/span> .)
decode_model &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
whitespace &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">lm&lt;/span>(dibit &lt;span style="color:#f92672">~&lt;/span> ascii_dec &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">I&lt;/span>(ascii_dec^2) &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">I&lt;/span>(ascii_dec^3), data &lt;span style="color:#f92672">=&lt;/span> .)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Visualising these models will help us see what&amp;rsquo;s going on. On the left is our encoding model, which takes our dibit values and maps them to our ASCII whitespace characters. On the right is our decoding model, which takes the ASCII whitespace characters&amp;rsquo; decimal value and maps it back to a dibit. What becomes obvious by visualising these is how they make no sense for any values outside of our four points:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-tale-of-two-optimisations/index_files/figure-html/unnamed-chunk-4-1.png" width="672" />&lt;/p>
&lt;p>Let&amp;rsquo;s take a look at the \(\beta\) coefficients:&lt;/p>
&lt;pre>&lt;code># A tibble: 4 × 3
parameter encode decode
&amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
1 beta_0 9.00 -31.8
2 beta_1 4.67 6.42
3 beta_2 -6.00 -0.381
4 beta_3 2.33 0.00669
&lt;/code>&lt;/pre>&lt;p>Immediately we see a bit of a problem. I&amp;rsquo;ll be honest that I was hoping - somewhat optimistically - for some nice, clean integer coefficients. Instead we&amp;rsquo;ve got floating point values. My gut feel is that having to use floating point instructions is not going to improve upon our original lookup table. But we won&amp;rsquo;t know until we profile, so let&amp;rsquo;s persist. Here&amp;rsquo;s the new polynomial encoding function, with the decoding function omitted for brevity:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-C" data-lang="C">&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#a6e22e">poly_encode&lt;/span>(&lt;span style="color:#66d9ef">const&lt;/span> &lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> dibit) {
&lt;span style="color:#66d9ef">return&lt;/span> (&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span>) (&lt;span style="color:#ae81ff">9.0&lt;/span> &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#ae81ff">4.666667&lt;/span> &lt;span style="color:#f92672">*&lt;/span> dibit &lt;span style="color:#f92672">-&lt;/span>
&lt;span style="color:#ae81ff">6.0&lt;/span> &lt;span style="color:#f92672">*&lt;/span> (dibit &lt;span style="color:#f92672">*&lt;/span> dibit) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#ae81ff">2.333333333333&lt;/span> &lt;span style="color:#f92672">*&lt;/span> (dibit &lt;span style="color:#f92672">*&lt;/span> dibit &lt;span style="color:#f92672">*&lt;/span> dibit));
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>We don&amp;rsquo;t need to (and are unlikely to) hit the mark exactly; we just need to get close enough that the whole part of the floating point value is correct. The cast to &lt;code>unsigned char&lt;/code> gives us this whole part, throwing away anything after the decimal point. Because the cast truncates rather than rounds, each result must land on or just above its target value, never just below it.&lt;/p>
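&lt;p>As a sanity check, here is the polynomial evaluated by hand for each dibit, using the coefficients exactly as typed in the code above. This is a pen-and-paper sketch that ignores floating point rounding error, which should be many orders of magnitude smaller than the margins shown:&lt;/p>
&lt;p>$$ f(0) = 9 \rightarrow 9 $$
$$ f(1) = 9 + 4.666667 - 6 + 2.333333333333 \approx 10.0000003 \rightarrow 10 $$
$$ f(2) = 9 + 9.333334 - 24 + 18.666666666664 \approx 13.0000007 \rightarrow 13 $$
$$ f(3) = 9 + 14.000001 - 54 + 62.999999999991 \approx 32.000001 \rightarrow 32 $$
Each value lands a hair above its target, so discarding the fractional part gives 9, 10, 13 and 32: tab, new line, carriage return and space. The margin comes from the \(\beta_1\) coefficient having been rounded &lt;em>up&lt;/em> to 4.666667; had it been rounded down, the results could fall just below the targets and truncate to the wrong byte.&lt;/p>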
&lt;h1 id="attempt-2-a-switch">Attempt 2: A Switch&lt;/h1>
&lt;p>While working on the polynomial function, it struck me that we could also use a conditional statement to make decisions on how to encode (and decode) individual dibits. Here&amp;rsquo;s an implementation which uses a switch statement:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-C" data-lang="C">&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#a6e22e">switch_encode&lt;/span>(&lt;span style="color:#66d9ef">const&lt;/span> &lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> dibit) {
&lt;span style="color:#66d9ef">switch&lt;/span> (dibit) {
&lt;span style="color:#66d9ef">case&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#f92672">:&lt;/span>
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\t&amp;#39;&lt;/span>;
&lt;span style="color:#66d9ef">case&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\n&amp;#39;&lt;/span>;
&lt;span style="color:#66d9ef">case&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>&lt;span style="color:#f92672">:&lt;/span>
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\r&amp;#39;&lt;/span>;
&lt;span style="color:#66d9ef">case&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span>&lt;span style="color:#f92672">:&lt;/span>
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span>;
}
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>We also need a way to select the algorithm the program uses at runtime. A command line option &lt;em>-a &amp;lt;algorithm&amp;gt;&lt;/em> has been added, where &amp;lt;algorithm&amp;gt; is either &amp;lsquo;lookup&amp;rsquo;, &amp;lsquo;poly&amp;rsquo; or &amp;lsquo;switch&amp;rsquo;. If none is specified it defaults to the original lookup table. You can find the full code for the new, multi-algorithm whitespacer &lt;a href="https://github.com/gregfoletta/whitespacer/tree/algorithms">here&lt;/a>.&lt;/p>
&lt;h1 id="profiling">Profiling&lt;/h1>
&lt;p>Rather than rely on supposition, let&amp;rsquo;s test how the different algorithms perform. I&amp;rsquo;ve created an R function which takes a vector of shell commands and returns how long they took to run. There will be a small amount of overhead in spawning a shell, but as this is roughly constant across all executions and as we&amp;rsquo;re looking at the &lt;em>differences&lt;/em> between the runtimes, it largely cancels out.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">system_profile &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(commands) {
&lt;span style="color:#a6e22e">map_dbl&lt;/span>(commands, &lt;span style="color:#f92672">~&lt;/span>{
start &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">proc.time&lt;/span>()[&lt;span style="color:#e6db74">&amp;#34;elapsed&amp;#34;&lt;/span>]
&lt;span style="color:#a6e22e">system&lt;/span>(.x, ignore.stdout &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#66d9ef">TRUE&lt;/span>)
finish &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">proc.time&lt;/span>()[&lt;span style="color:#e6db74">&amp;#34;elapsed&amp;#34;&lt;/span>]
finish &lt;span style="color:#f92672">-&lt;/span> start
})
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>We now run 100 iterations of an encode / decode pipeline for each algorithm and look at the time each takes to run, piping in a 32MB file of random bytes generated from &lt;em>/dev/urandom&lt;/em>. The output is dumped to &lt;em>/dev/null&lt;/em>. The executables have been compiled with all optimisations disabled.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">profiling_results &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">tibble&lt;/span>(
n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">300&lt;/span>,
algo &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rep&lt;/span>(&lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;lookup&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;poly&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;switch&amp;#39;&lt;/span>), &lt;span style="color:#a6e22e">max&lt;/span>(n) &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span>)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
command &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">glue&lt;/span>(
&lt;span style="color:#e6db74">&amp;#39;./ws_debug -a { algo } &amp;lt; urandom_32M | ./ws_debug -d -a { algo } &amp;gt; /dev/null&amp;#39;&lt;/span>
),
time &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">system_profile&lt;/span>(command)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(&lt;span style="color:#f92672">-&lt;/span>command)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Rather than simply looking at the means or medians for each of the algorithms, we take a look at the distribution of runtimes for each with the mean highlighted.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-tale-of-two-optimisations/index_files/figure-html/unnamed-chunk-8-1.png" width="672" />&lt;/p>
&lt;p>As expected, the polynomial encoding/decoding is slower than the lookup table. But what is really surprising is the switch statement: it&amp;rsquo;s slower than both! On average it&amp;rsquo;s 2.45 seconds slower than the lookup table! I love a good surprise, so let&amp;rsquo;s dive in and try to work out what&amp;rsquo;s happening.&lt;/p>
&lt;h1 id="whats-with-the-switch">What&amp;rsquo;s With The Switch?&lt;/h1>
&lt;p>We&amp;rsquo;ll take a bottom-up approach and look at the instructions that were being executed on the CPU as the process was running. Using &lt;code>perf record&lt;/code> we take samples of the process&amp;rsquo;s state, most importantly the instruction pointer. By using the &lt;code>-b&lt;/code> switch, perf also captures the &lt;em>Last Branch Record (LBR) stack&lt;/em>. With the LBR processor feature, the CPU logs the &lt;em>from&lt;/em> and &lt;em>to&lt;/em> addresses of predicted and mispredicted taken branches to a set of special purpose registers. With this information, &lt;em>perf&lt;/em> can reconstruct a history of the instructions executed, rather than having only the single point in time captured by the sample.&lt;/p>
&lt;p>Let&amp;rsquo;s run the encoding half of the pipeline and feed it 32MB of random bytes:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">perf record -b -o switch.data -e cycles:pp ./ws_debug -a switch &amp;lt; urandom_32M &amp;gt; /dev/null
&lt;span style="color:#f92672">[&lt;/span> perf record: Woken up &lt;span style="color:#ae81ff">22&lt;/span> times to write data &lt;span style="color:#f92672">]&lt;/span>
&lt;span style="color:#f92672">[&lt;/span> perf record: Captured and wrote 5.345 MB switch.data &lt;span style="color:#f92672">(&lt;/span>&lt;span style="color:#ae81ff">6833&lt;/span> samples&lt;span style="color:#f92672">)&lt;/span> &lt;span style="color:#f92672">]&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Looking at the trace data (saved in &lt;em>switch.data&lt;/em>), the &lt;em>brstackinsn&lt;/em> field allows us to see the instructions executed and approximate CPU cycles for different branches. Perf has captured around 43,000 executions of our &lt;code>switch_encode()&lt;/code> function, with just one of these displayed below:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">perf script -F +brstackinsn -i switch.data
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-asm" data-lang="asm">switch_encode:
&lt;span style="color:#960050;background-color:#1e0010">0000560&lt;/span>&lt;span style="color:#a6e22e">dd2bffddd&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#ae81ff">55&lt;/span>
&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffdde&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#ae81ff">48&lt;/span> &lt;span style="color:#ae81ff">89&lt;/span> &lt;span style="color:#66d9ef">e5&lt;/span>
&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffde1&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#ae81ff">89&lt;/span> &lt;span style="color:#66d9ef">f8&lt;/span>
&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffde3&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#ae81ff">88&lt;/span> &lt;span style="color:#ae81ff">45&lt;/span> &lt;span style="color:#66d9ef">fc&lt;/span>
&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffde6&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#66d9ef">f&lt;/span> &lt;span style="color:#66d9ef">b6&lt;/span> &lt;span style="color:#ae81ff">45&lt;/span> &lt;span style="color:#66d9ef">fc&lt;/span>
&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffdea&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#ae81ff">83&lt;/span> &lt;span style="color:#66d9ef">f8&lt;/span> &lt;span style="color:#ae81ff">01&lt;/span>
&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffded&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#ae81ff">74&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#66d9ef">e&lt;/span> &lt;span style="color:#75715e"># MISPRED 4 cycles 1.50 IPC
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffe0d&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#66d9ef">b8&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#66d9ef">a&lt;/span> &lt;span style="color:#ae81ff">00&lt;/span> &lt;span style="color:#ae81ff">00&lt;/span> &lt;span style="color:#ae81ff">00&lt;/span>
&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffe12&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#66d9ef">eb&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#66d9ef">e&lt;/span> &lt;span style="color:#75715e"># PRED 22 cycles 0.05 IPC
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffe22&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#ae81ff">5&lt;/span>&lt;span style="color:#66d9ef">d&lt;/span>
&lt;span style="color:#ae81ff">0000560&lt;/span>&lt;span style="color:#66d9ef">dd2bffe23&lt;/span> &lt;span style="color:#66d9ef">insn&lt;/span>: &lt;span style="color:#66d9ef">c3&lt;/span> &lt;span style="color:#75715e"># PRED 5 cycles 0.20 IPC
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Even without decoding the machine code, what immediately stands out is the 22 cycles taken following the branch misprediction. Let&amp;rsquo;s now zoom out and run &lt;code>perf stat&lt;/code> to pull out some summary statistics:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">&lt;span style="color:#75715e"># Perf stat with 32Mb urandom bytes&lt;/span>
perf stat ./ws_debug -a switch &amp;lt; urandom_32M &amp;gt; /dev/null
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>
Performance counter stats for './ws_debug -a switch':
1743.865268 task-clock (msec) # 1.000 CPUs utilized
7 context-switches # 0.004 K/sec
0 cpu-migrations # 0.000 K/sec
12,852 page-faults # 0.007 M/sec
6,484,859,596 cycles # 3.719 GHz
5,773,682,725 instructions # 0.89 insn per cycle
967,903,800 branches # 555.034 M/sec
123,207,223 branch-misses # 12.73% of all branches
1.744494500 seconds time elapsed
&lt;/code>&lt;/pre>&lt;p>The first statistic that stands out is the instructions per cycle, which is just under one. The likely cause is the number of branch misses, which is up at ~12.7% of all branches. These branch prediction misses are destroying any benefit of pipelining in the CPU.&lt;/p>
&lt;p>I&amp;rsquo;ve got a sneaking suspicion that the cause of these missed branch predictions is our input data, so let&amp;rsquo;s test the hypothesis. We create a 32MB file of all zeros from &lt;em>/dev/zero&lt;/em>, then re-profile with this data as input instead of the random bytes. I&amp;rsquo;ll omit the code and go straight to the results:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-tale-of-two-optimisations/index_files/figure-html/zero_profiling-1.png" width="672" />&lt;/p>
&lt;p>The switch algorithm now outperforms the others! Here&amp;rsquo;s the &lt;em>perf stat&lt;/em>:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Perf stat with 32Mb of zero bytes&lt;/span>
perf stat ./ws_debug -a switch &amp;lt; zero_32M &amp;gt; /dev/null
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>
Performance counter stats for './ws_debug -a switch':
543.503058 task-clock (msec) # 0.999 CPUs utilized
8 context-switches # 0.015 K/sec
0 cpu-migrations # 0.000 K/sec
12,854 page-faults # 0.024 M/sec
1,953,309,524 cycles # 3.594 GHz
5,835,245,013 instructions # 2.99 insn per cycle
999,489,826 branches # 1838.977 M/sec
1,226,487 branch-misses # 0.12% of all branches
0.544021460 seconds time elapsed
&lt;/code>&lt;/pre>&lt;p>Branch prediction misses have dropped from 12.73% to 0.12% of all branches, instructions per cycle have risen from 0.89 to 2.99, and the run is roughly three times faster than with the random bytes.&lt;/p>
&lt;p>Now this is where I start to butt up against the limits of my CPU architecture knowledge (feel free to &lt;a href="mailto:greg@foletta.org">contact me&lt;/a> with any corrections), but what I assume is happening is that the branch predictor on my CPU is using historical branching information to try and make good guesses. But when the input is random bytes, history provides no additional information, and the branch predictor adds no benefit. In fact it may be a hindrance due to the penalty paid for each mispredicted branch.&lt;/p>
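&lt;p>A rough back-of-envelope check, using only the two &lt;em>perf stat&lt;/em> outputs above, is to divide the extra cycles by the extra branch misses. This assumes the whole difference in cycles is down to mispredictions, which ignores the slightly different instruction counts and any other effects, so treat it as an estimate rather than a measurement:&lt;/p>
&lt;p>$$ \frac{6{,}484{,}859{,}596 - 1{,}953{,}309{,}524}{123{,}207{,}223 - 1{,}226{,}487} = \frac{4{,}531{,}550{,}072}{121{,}980{,}736} \approx 37 \text{ cycles per additional miss} $$&lt;/p>
&lt;p>That is the same order of magnitude as the 22 cycles seen after the mispredicted branch in the single trace earlier.&lt;/p>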
&lt;h1 id="lookup-versus-polynomial">Lookup versus Polynomial&lt;/h1>
&lt;p>Now let&amp;rsquo;s take a look at the unsurprising result: our original lookup algorithm versus the polynomial. We&amp;rsquo;ll go straight to using &lt;code>perf stat&lt;/code> to see high-level statistics. First up is the lookup algorithm:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Lookup table&lt;/span>
perf stat ./ws_debug -a lookup &amp;lt; urandom_32M &amp;gt; /dev/null
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>
Performance counter stats for './ws_debug -a lookup':
511.025199 task-clock (msec) # 0.999 CPUs utilized
7 context-switches # 0.014 K/sec
1 cpu-migrations # 0.002 K/sec
12,852 page-faults # 0.025 M/sec
1,802,617,592 cycles # 3.527 GHz
5,195,307,808 instructions # 2.88 insn per cycle
487,513,834 branches # 953.992 M/sec
506,448 branch-misses # 0.10% of all branches
0.511367875 seconds time elapsed
&lt;/code>&lt;/pre>&lt;p>Now the polynomial:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">&lt;span style="color:#75715e"># Polynomial&lt;/span>
perf stat ./ws_debug -a poly &amp;lt; urandom_32M &amp;gt; /dev/null
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>
Performance counter stats for './ws_debug -a poly':
1063.839451 task-clock (msec) # 1.000 CPUs utilized
2 context-switches # 0.002 K/sec
0 cpu-migrations # 0.000 K/sec
12,850 page-faults # 0.012 M/sec
3,829,284,007 cycles # 3.599 GHz
7,755,413,038 instructions # 2.03 insn per cycle
487,472,058 branches # 458.220 M/sec
537,512 branch-misses # 0.11% of all branches
1.064266885 seconds time elapsed
&lt;/code>&lt;/pre>&lt;p>We see two main reasons as to why the polynomial is slower. First off it simply takes more instructions to calculate the polynomial as opposed to looking up the value in a lookup table. Second, even with the (likely cached) memory accesses, we&amp;rsquo;re able to execute more instructions per cycle with the lookup table as opposed to the polynomial algorithm.&lt;/p>
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>In this article we looked at two different alternatives to a lookup table for encoding and decoding bytes in the &lt;em>whitespacer&lt;/em> program. The first was to try and use a polynomial function, and the second was to use a conditional switch statement.&lt;/p>
&lt;p>We found a surprising result with the switch statement, and were able to determine using the &lt;em>perf&lt;/em> tool that we were paying a penalty for missed branch predictions due to the randomness of the input data.&lt;/p>
&lt;p>When you run into a problem or a surprising result, you&amp;rsquo;re often gifted an opportunity to learn something new. I&amp;rsquo;ve certainly learned a lot about &lt;em>perf&lt;/em>, LBR stacks and branch prediction in putting this article together.&lt;/p></description></item><item><title>Bandwidth Seasonal Decomposition</title><link>https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/</link><pubDate>Wed, 11 Aug 2021 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/</guid><description>&lt;p>Over the past few months I&amp;rsquo;ve been studying time series data and modelling using Rob Hyndman&amp;rsquo;s fantastic &lt;a href="https://otexts.com/fpp3">Forecasting: Principles and Practice&lt;/a> textbook. My area of expertise is in networking, and a significant amount of the operational data that we deal with fits into the category of time series data. It could be throughput through a router, sessions on a firewall, or the counts of HTTP response codes from a content delivery network.&lt;/p>
&lt;p>In this article we&amp;rsquo;re going to look at applying time series concepts and models to the bandwidth utilisation - both ingress and egress - of a network router. Using historical data, we&amp;rsquo;ll see how we can decompose this data into separate components, and then use these components to forecast the bandwidth utilisation in the future.&lt;/p>
&lt;h1 id="caveats">Caveats&lt;/h1>
&lt;p>A couple of things to note before we start. In a real world situation we&amp;rsquo;d likely try a few different models, tune them with different parameters, and evaluate them using cross validation or bootstrapping. We&amp;rsquo;re going to keep things simple in this article by using a single, parsimonious model with one set of parameters to forecast our network throughput, and a simple training/test set split for validating our model&amp;rsquo;s performance.&lt;/p>
&lt;p>We&amp;rsquo;re also going to skip over the underlying mathematics of our methods, focusing on their practical use. Leaving this out is a hard decision for me, as I dislike using statistical methods without a decent understanding of how things are working under the hood. However this is an article, not a dissertation, so the underlying mathematics will be left for another day.&lt;/p>
&lt;h1 id="a-quick-primer">A Quick Primer&lt;/h1>
&lt;p>Time series data focuses on a single variable that is observed multiple times over a period of time. Contrast this against cross-sectional data, which focuses on multiple variables observed at the same point in time.&lt;/p>
&lt;p>Certain kinds of time series data exhibit patterns which make it possible to split or &amp;lsquo;decompose&amp;rsquo; it into different components:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Trend&lt;/strong>: the long term increase or decrease in the data.&lt;/li>
&lt;li>&lt;strong>Cycle&lt;/strong>: rises and falls in the data that do not have a fixed period.&lt;/li>
&lt;li>&lt;strong>Seasonal&lt;/strong>: a pattern that occurs due to seasonal factors such as the time of the day. It&amp;rsquo;s a fixed and known period.&lt;/li>
&lt;li>&lt;strong>Remainder&lt;/strong>: what&amp;rsquo;s left after the trend, cycle, and seasonal components are removed.&lt;/li>
&lt;/ul>
&lt;p>Often the trend and cycle components are combined into a single component called the &lt;em>trend-cycle&lt;/em>.&lt;/p>
&lt;h1 id="the-data">The Data&lt;/h1>
&lt;p>Let&amp;rsquo;s get an understanding of the data and perform some diagnostics on it. The data we&amp;rsquo;re using is contained in a time series table or &lt;a href="https://github.com/tidyverts/tsibble">tsibble&lt;/a> called &lt;code>throughput&lt;/code>. It consists of 1440 observations of the ingress and egress throughput through a router over 30 days. Each observation is the average throughput through the router over a 30 minute interval.&lt;/p>
&lt;p>As we&amp;rsquo;re going to be forecasting, we immediately split our data into a training set and a test set. The training set will contain the first 23 days of data, and the test set will contain the last 7. We&amp;rsquo;ll perform discovery and train our model on the training set, leaving the test set to ascertain the accuracy of our model.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Training set&lt;/span>
throughput_train &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
throughput &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">filter_index&lt;/span>(. &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#e6db74">&amp;#39;2021-06-24&amp;#39;&lt;/span>)
&lt;span style="color:#75715e"># Test set&lt;/span>
throughput_test &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
throughput &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">filter_index&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;2021-06-25&amp;#39;&lt;/span> &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#e6db74">&amp;#39;2021-07-01&amp;#39;&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Let&amp;rsquo;s take a look at the training data.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/index_files/figure-html/unnamed-chunk-4-1.png" width="672" />&lt;/p>
&lt;p>We see a very clear pattern with a daily &amp;ldquo;seasonal period&amp;rdquo; for both the ingress and egress directions. As expected, there is more ingress data than egress data. We can use a &lt;em>seasonal plot&lt;/em> to get a better view of the seasonality. This chart plots each day over the top of each other, giving us a view of the traffic profile throughout each hour of the day.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/index_files/figure-html/unnamed-chunk-5-1.png" width="672" />&lt;/p>
&lt;p>For both ingress and egress directions we see throughput dropping overnight and reaching a minimum around 4am. It rises slowly at first, increasing at 9am when everyone starts work. It continues to rise throughout the day, peaking at around 9pm.&lt;/p>
&lt;h1 id="modelling--decomposition">Modelling &amp;amp; Decomposition&lt;/h1>
&lt;p>We&amp;rsquo;re going to use &amp;lsquo;Seasonal and Trend decomposition using LOESS&amp;rsquo; (STL) to decompose our time series, where LOESS is &amp;lsquo;LOcally Estimated Scatterplot Smoothing&amp;rsquo; (acronym inception!). We&amp;rsquo;ll use this to additively decompose our time series at time &lt;em>t&lt;/em> into trend, seasonal, and remainder components, written as:&lt;/p>
&lt;p>$$ y_t = T_t + S_t + R_t $$
Given our domain expertise and our initial view of the data, we&amp;rsquo;re confident that the seasonal period is one day. But it would be nice to put a quantitative number around that. We can use the &lt;code>feat_stl()&lt;/code> function to pull out some STL specific features. As our data is measured every 30 minutes, a period of 48 is one day.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">throughput_train &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">features&lt;/span>(Mbps, &lt;span style="color:#a6e22e">list&lt;/span>(&lt;span style="color:#f92672">~&lt;/span>{ &lt;span style="color:#a6e22e">feat_stl&lt;/span>(.x, .period &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">48&lt;/span>) })) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(
direction, trend_strength, &lt;span style="color:#a6e22e">starts_with&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;seasonal_strength&amp;#39;&lt;/span>)
)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 2 × 3
direction trend_strength seasonal_strength_48
&amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
1 egress 0.659 0.934
2 ingress 0.583 0.980
&lt;/code>&lt;/pre>&lt;p>Both the &lt;em>trend strength&lt;/em> and &lt;em>seasonal strength&lt;/em> are statistics between 0 and 1, giving a measure of the strength of the components that the STL decomposition has extracted. We see a reasonable trend, but a very large seasonal strength, adding to the evidence of a daily seasonal pattern. We can now define our STL model, run it over our data, and extract out the components.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Define our STL model&lt;/span>
STL_tp &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">STL&lt;/span>(
Mbps &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#a6e22e">trend&lt;/span>() &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#a6e22e">season&lt;/span>(period &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;1 day&amp;#39;&lt;/span>),
robust &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#66d9ef">TRUE&lt;/span>
)
&lt;span style="color:#75715e"># Run an STL model across our data&lt;/span>
tp_stl_mdl &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
throughput_train &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">model&lt;/span>(STL &lt;span style="color:#f92672">=&lt;/span> STL_tp)
&lt;span style="color:#75715e"># Extract out the components&lt;/span>
tp_stl_decomp &lt;span style="color:#f92672">&amp;lt;-&lt;/span> tp_stl_mdl &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">components&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Let&amp;rsquo;s take a look at each of the components in both the ingress and egress directions.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/index_files/figure-html/unnamed-chunk-8-1.png" width="672" />&lt;/p>
&lt;p>At a first glance the STL decomposition has done well. The trend-cycle looks pretty flat, which is to be expected given the relatively short time frame of the data. The ebbs and flows of the trend-cycle could be cyclic, or could be some longer term seasonality such as a weekly seasonality that we haven&amp;rsquo;t captured.&lt;/p>
&lt;p>You may have noticed that there are negative values for both the seasonal and remainder components. As we&amp;rsquo;re using an additive model, these values are relative to the trend component at each point in time.&lt;/p>
&lt;p>The decomposition also gives us the &lt;em>seasonally adjusted&lt;/em> series. This is the series with the seasonal component removed.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/index_files/figure-html/unnamed-chunk-9-1.png" width="672" />&lt;/p>
&lt;p>The seasonally adjusted series is important when we want to use this decomposition to forecast future values.&lt;/p>
&lt;p>The remainder is relatively small, which is a good sign as it implies that we&amp;rsquo;ve pulled out most of the &amp;lsquo;signal&amp;rsquo; in the trend-cycle and seasonal components. Let&amp;rsquo;s focus in on the remainder, also known as the residual.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/index_files/figure-html/unnamed-chunk-10-1.png" width="672" />&lt;/p>
&lt;p>Looking at the line graph of the residuals, there may be some seasonality we haven&amp;rsquo;t completely captured, but that&amp;rsquo;s not surprising given the simplicity of the model. The residuals have a reasonably Gaussian distribution, but have long tails. A &lt;em>Quantile-Quantile&lt;/em> plot will help us view this in more detail.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/index_files/figure-html/unnamed-chunk-11-1.png" width="672" />&lt;/p>
&lt;p>This plot compares the distribution of our residuals against a Gaussian distribution. If our residuals were Gaussian, all points would lie on the 45 degree line. We see that for one standard deviation around the mean for the egress direction, and two standard deviations for the ingress direction, our residuals are reasonably Gaussian. However outside of this we start to see the long tails of our data.&lt;/p>
&lt;p>The distribution of the residuals doesn&amp;rsquo;t affect our forecasts, but it does affect the prediction intervals around the forecasts, which assume a Gaussian distribution of the residuals. From what we&amp;rsquo;ve seen here we could be comfortable with a prediction interval of around 70% to 80% around our forecasts, but anything higher breaks our assumptions and thus could not be relied upon.&lt;/p>
&lt;h1 id="forecasting">Forecasting&lt;/h1>
&lt;p>Now that we&amp;rsquo;ve decomposed our series, we can use this as a way to forecast our series into the future. This is done by forecasting the seasonal component and the seasonally adjusted component separately. These two separate forecasts can then be added together to form our single forecast. The prediction intervals are &amp;lsquo;reseasonalised&amp;rsquo; in a similar way by adding the seasonal forecasts to the upper and lower limits of the prediction intervals.&lt;/p>
&lt;p>We&amp;rsquo;re going to use two very simple models to forecast the components. To forecast the seasonally adjusted data we&amp;rsquo;ll use a &lt;em>naive&lt;/em> method, which sets all forecasts to be the value of the last observation. To forecast the seasonal component, we&amp;rsquo;ll use a &lt;em>seasonal naive&lt;/em> method, which sets the forecast to be equal to the last observed value from the same season, in our case a season being one day.&lt;/p>
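&lt;p>Written out, the two methods look like this. The notation is my own choice for illustration rather than anything taken from the packages: the seasonally adjusted series is the trend plus the remainder, &lt;em>N&lt;/em> is the index of the last training observation, &lt;em>m&lt;/em> is the seasonal period (48 half-hour observations for one day) and &lt;em>h&lt;/em> is the number of steps ahead being forecast.&lt;/p>
&lt;p>$$ A_t = T_t + R_t, \qquad \hat{A}_{N+h \mid N} = A_N, \qquad \hat{S}_{N+h \mid N} = S_{N+h-m(k+1)}, \qquad k = \left\lfloor \frac{h-1}{m} \right\rfloor $$&lt;/p>
&lt;p>The combined forecast is simply the sum of the two:&lt;/p>
&lt;p>$$ \hat{y}_{N+h \mid N} = \hat{A}_{N+h \mid N} + \hat{S}_{N+h \mid N} $$&lt;/p>
&lt;p>So every forecast for, say, 9am is the last seasonally adjusted value plus the seasonal component from 9am on the final day of the training data.&lt;/p>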
&lt;p>The &lt;code>decomposition_model()&lt;/code> function from the &lt;code>fabletools&lt;/code> package does a lot of the heavy lifting for us, and we then forecast our throughput for the next week:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Decompose and model and forecast&lt;/span>
throughput_dcmp_fc &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
throughput_train &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">model&lt;/span>(
SNAIVE &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">decomposition_model&lt;/span>(
STL_tp,
&lt;span style="color:#a6e22e">NAIVE&lt;/span>(season_adjust),
&lt;span style="color:#a6e22e">SNAIVE&lt;/span>(`season_1 day` &lt;span style="color:#f92672">~&lt;/span> &lt;span style="color:#a6e22e">lag&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;1 day&amp;#39;&lt;/span>))
),
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">forecast&lt;/span>(h &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;7 days&amp;#39;&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Let&amp;rsquo;s take a look at the forecast with a 70% prediction interval, and compare it to our test data which holds the real values for the next week.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2021-07-05-bandwidth-seasonal-decomposition/index_files/figure-html/unnamed-chunk-13-1.png" width="672" />&lt;/p>
&lt;p>Using two of the simplest modelling methods, we&amp;rsquo;ve been able to forecast the throughput reasonably well. The forecasts for our egress direction are a little off due to the larger variance of the data, but the ingress direction forecasts are quite close to our test data. It&amp;rsquo;s also comforting that all of the test data sits within our 70% prediction interval.&lt;/p>
&lt;p>We&amp;rsquo;ll use two metrics to gauge the accuracy of our forecasts. The first one is &lt;em>mean absolute error (MAE)&lt;/em>, which is the mean of the absolute value of the difference between our forecasts and the actual data. MAE is nice because it&amp;rsquo;s in the same units as our y-axis. The second is &lt;em>mean absolute percentage error (MAPE)&lt;/em>, which gives our error as a percentage of the actual test data. I should note that MAPE doesn&amp;rsquo;t work in all circumstances; for example it&amp;rsquo;s undefined if any of our test values are 0.&lt;/p>
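&lt;p>For reference, the standard definitions of the two metrics can be written as follows. The symbols are my own choice of notation: each &lt;em>y&lt;/em> is an observed value in the test set, each &lt;em>y&lt;/em>-hat is the corresponding forecast, and &lt;em>n&lt;/em> is the number of test observations.&lt;/p>
&lt;p>$$ \mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \lvert y_t - \hat{y}_t \rvert, \qquad \mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left\lvert \frac{y_t - \hat{y}_t}{y_t} \right\rvert $$&lt;/p>
&lt;p>The division by the observed value is what makes MAPE undefined whenever a test value is zero.&lt;/p>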
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">throughput_dcmp_fc &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">accuracy&lt;/span>(
throughput_test,
measures &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">list&lt;/span>(
MAE &lt;span style="color:#f92672">=&lt;/span> MAE,
MAPE &lt;span style="color:#f92672">=&lt;/span> MAPE
)
)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 2 × 5
.model direction .type MAE MAPE
&amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
1 SNAIVE egress Test 9.95 14.8
2 SNAIVE ingress Test 16.0 6.50
&lt;/code>&lt;/pre>&lt;p>For our egress direction our forecasts are out on average by 9.95 megabits per second, or 14.83 percent. With the reduced variability of the data our ingress forecasts are better in relative terms: the absolute error is larger at 16.04 megabits per second, but that is only 6.5 percent of the actual values.&lt;/p>
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>In this article we&amp;rsquo;ve looked at analysing and forecasting network throughput. We analysed the data to confirm our assumptions about its seasonality, then used STL to decompose it into its seasonal, trend-cycle, and remainder components. We applied very simple modelling methods on each of these components and used these to forecast the next week of values. Comparing these values to the actual test data we found we were reasonably accurate with our forecasts, even with the simple methods used.&lt;/p>
&lt;p>In a real world situation we&amp;rsquo;re probably not concerned with what our throughput utilisation is next week. We&amp;rsquo;d be more interested in what the throughput will be in one or two years so we can make better decisions on investment. This doesn&amp;rsquo;t mean the simple models we&amp;rsquo;ve shown here aren&amp;rsquo;t useful, just that our training data would likely need to be larger to detect long term trend and cycle components and changes in daily seasonality. We may also want to try out some more complex models, test out their accuracy using cross-validation on our training data, and use bootstrapping to better determine the prediction intervals of the forecasts. But I&amp;rsquo;d hasten to add that more complex models aren&amp;rsquo;t always the answer, and that as we&amp;rsquo;ve shown here you can get very usable forecasts and prediction intervals with parsimonious models.&lt;/p></description></item><item><title>Whitespacer</title><link>https://clt.blog.foletta.net/post/2021-06-21-whitespacer/</link><pubDate>Wed, 30 Jun 2021 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2021-06-21-whitespacer/</guid><description>&lt;p>A few weeks ago I was analysing some packet captures and &lt;a href="https://datatracker.ietf.org/doc/html/rfc2616#section-2.2">thanking the RFC gods&lt;/a> that HTTP - and many other protocols - use ASCII/UTF-8 rather than packing everything into binary.&lt;/p>
&lt;p>I then started thinking how confusing it would be to look at a packet capture and only see whitespace in the conversation. And thus &lt;strong>whitespacer&lt;/strong> was born: a utility to encode messages into pure, soft, clean whitespace.&lt;/p>
&lt;h1 id="how-it-works">How It Works&lt;/h1>
&lt;p>Whitespacer has two modes: encoding and decoding. In encoding mode, it takes bytes from standard in, encodes them as whitespace characters, and writes them to standard out. Using the &amp;lsquo;-d&amp;rsquo; switch on the command line moves it to decoding mode, taking the whitespace characters and decoding them back to the original bytes. You can find the full code (written in C) on my &lt;a href="https://github.com/gregfoletta/whitespacer">github repo&lt;/a>.&lt;/p>
&lt;p>The encoding is a simple base-4 encoding. Each byte has its four groups of two bits encoded into one of four whitespace characters:&lt;/p>
&lt;ul>
&lt;li>00 -&amp;gt; &amp;lsquo;\t&amp;rsquo;&lt;/li>
&lt;li>01 -&amp;gt; &amp;lsquo;\n&amp;rsquo;&lt;/li>
&lt;li>10 -&amp;gt; &amp;lsquo;\r&amp;rsquo;&lt;/li>
&lt;li>11 -&amp;gt; ' ' (space)&lt;/li>
&lt;/ul>
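&lt;p>As a minimal sketch of the mapping above (not the code from the repository), the function below splits one byte into its four two-bit groups, least significant pair first, and maps each group to a whitespace character. The names &lt;code>encode_byte&lt;/code> and &lt;code>ws_chars&lt;/code> are hypothetical, chosen for illustration; the least-significant-first order follows the encoding function shown later in this post.&lt;/p>

```c
#include <assert.h>
#include <stddef.h>

/* The four whitespace characters, indexed by the two-bit value they encode. */
static const unsigned char ws_chars[4] = { '\t', '\n', '\r', ' ' };

/* Hypothetical helper: encode one byte into four whitespace characters,
 * taking the least significant pair of bits first. */
void encode_byte(unsigned char byte, unsigned char out[4]) {
    assert(out != NULL);
    for (int y = 0; y < 4; y++) {
        out[y] = ws_chars[(byte >> (2 * y)) & 0x03];
    }
}
```

&lt;p>For &amp;lsquo;H&amp;rsquo; (0x48, binary 01 00 10 00) this gives the bytes 0x09 0x0d 0x09 0x0a. Those are the first four bytes of the hexdump in the next section, bearing in mind that hexdump&amp;rsquo;s default output prints 16-bit words, so each pair of bytes appears swapped on a little-endian machine.&lt;/p>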
&lt;p>In this article I&amp;rsquo;ll take you through the program itself, then some sections of code I found interesting to write.&lt;/p>
&lt;h1 id="the-program">The Program&lt;/h1>
&lt;p>The code has been compiled into an ELF file called &amp;lsquo;ws&amp;rsquo;. Let&amp;rsquo;s see what the encoding looks like piped into hexdump:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">echo &lt;span style="color:#e6db74">&amp;#34;Hello World!&amp;#34;&lt;/span> | ./ws | hexdump
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>0000000 0d09 0a09 0a0a 0a0d 2009 0a0d 2009 0a0d
0000010 2020 0a0d 0909 090d 0a20 0a0a 2020 0a0d
0000020 090d 0a20 2009 0a0d 0a09 0a0d 090a 090d
0000030 0d0d 0909
0000034
&lt;/code>&lt;/pre>&lt;p>You can see that the only bytes present are 0x9, 0xa, 0xd and 0x20: our whitespace characters. Encoding and then decoding (as it should) gives us our original string back.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">echo &lt;span style="color:#e6db74">&amp;#34;Hello World!&amp;#34;&lt;/span> | ./ws | ./ws -d
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>Hello World!
&lt;/code>&lt;/pre>&lt;h2 id="correctness">Correctness&lt;/h2>
&lt;p>Let&amp;rsquo;s test it for correctness. We&amp;rsquo;ll generate a 128kB file filled with random bytes and run these through the encoder/decoder.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">&lt;span style="color:#75715e"># Create a 128kB file full of random data&lt;/span>
dd &lt;span style="color:#66d9ef">if&lt;/span>&lt;span style="color:#f92672">=&lt;/span>/dev/urandom of&lt;span style="color:#f92672">=&lt;/span>urandom bs&lt;span style="color:#f92672">=&lt;/span>1KB count&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">128&lt;/span>
&lt;span style="color:#75715e"># Run the file through the encoder/decoder&lt;/span>
./ws &amp;lt; urandom | ./ws -d &amp;gt; urandom.transfer
&lt;span style="color:#75715e"># Are the files the same?&lt;/span>
md5sum urandom urandom.transfer
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>128+0 records in
128+0 records out
128000 bytes (128 kB, 125 KiB) copied, 0.00582111 s, 22.0 MB/s
0f92a99dd3e2556acf41097a9fc74037 urandom
0f92a99dd3e2556acf41097a9fc74037 urandom.transfer
&lt;/code>&lt;/pre>&lt;p>The MD5 hashes are the same, implying it&amp;rsquo;s encoding and decoding each byte correctly. We can be doubly sure by looking at the distribution of byte values in the random file using a little bit of R.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">tibble&lt;/span>(
bytes &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">readBin&lt;/span>(
&lt;span style="color:#a6e22e">file&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;urandom&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;rb&amp;#39;&lt;/span>),
what &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;integer&amp;#39;&lt;/span>,
size &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>,
signed &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#66d9ef">TRUE&lt;/span>,
n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">file.size&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;urandom&amp;#39;&lt;/span>)
)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_histogram&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(bytes), binwidth &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Random Bytes From /dev/urandom&amp;#39;&lt;/span>,
subtitle &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Byte Distribution (total)&amp;#39;&lt;/span>,
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Byte Value&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Frequency&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2021-06-21-whitespacer/index_files/figure-html/unnamed-chunk-6-1.png" width="672" />&lt;/p>
&lt;p>So we know that it&amp;rsquo;s encoded and decoded every byte value successfully. We now also know that /dev/urandom appears to be uniformly distributed.&lt;/p>
&lt;h2 id="performance">Performance&lt;/h2>
&lt;p>We&amp;rsquo;ve tested it for correctness, but how fast does it run? We create a 3GB file and ensure it&amp;rsquo;s cached in memory using &lt;em>fincore&lt;/em>, available in the &lt;a href="https://github.com/david415/linux-ftools">linux-ftools&lt;/a> package. Unfortunately fincore doesn&amp;rsquo;t columnate very well; the last value is the percentage of the file in the cache.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">&lt;span style="color:#75715e"># Create a 3GB file full of zeros&lt;/span>
dd &lt;span style="color:#66d9ef">if&lt;/span>&lt;span style="color:#f92672">=&lt;/span>/dev/zero of&lt;span style="color:#f92672">=&lt;/span>zero bs&lt;span style="color:#f92672">=&lt;/span>100M count&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">30&lt;/span>
&lt;span style="color:#75715e"># Read the file to add it to the cache&lt;/span>
cat zero &amp;gt; /dev/null
&lt;span style="color:#75715e"># Confirm the file is in the cache&lt;/span>
fincore --pages&lt;span style="color:#f92672">=&lt;/span>false zero
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>30+0 records in
30+0 records out
3145728000 bytes (3.1 GB, 2.9 GiB) copied, 2.82209 s, 1.1 GB/s
filename size total pages cached pages cached size cached percentage
zero 3145728000 768000 768000 3145728000 100.000000
&lt;/code>&lt;/pre>&lt;p>Next we perform a baseline without our encoder/decoder, measuring the throughput using the &lt;em>pipeviewer&lt;/em> utility.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">&lt;span style="color:#75715e"># Baseline without encode/decode&lt;/span>
cat zero | pv -fa &amp;gt; /dev/null
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>[2.23GiB/s]
[2.33GiB/s]
&lt;/code>&lt;/pre>&lt;p>Now we insert whitespacer into the pipeline:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">&lt;span style="color:#75715e"># Check the throughput&lt;/span>
cat zero | ./ws | ./ws -d | pv -fa &amp;gt; /dev/null
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>[ 237MiB/s]
[ 270MiB/s]
[ 254MiB/s]
[ 254MiB/s]
[ 262MiB/s]
[ 269MiB/s]
[ 271MiB/s]
[ 265MiB/s]
[ 271MiB/s]
[ 269MiB/s]
[ 272MiB/s]
[ 272MiB/s]
&lt;/code>&lt;/pre>&lt;p>It&amp;rsquo;s a significant hit to throughput, but I don&amp;rsquo;t think it&amp;rsquo;s too bad.&lt;/p>
&lt;h2 id="wireshark">Wireshark&lt;/h2>
&lt;p>Finally, at the start I talked about what it would look like in Wireshark, so let&amp;rsquo;s look at that. We set up a TCP listener on port 8080 piping to our decoder. We then encode a string and send it down this TCP session. We&amp;rsquo;ll take a packet capture at the same time.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">&lt;span style="color:#75715e"># TCP listener and decoder&lt;/span>
nc -l &lt;span style="color:#ae81ff">8080&lt;/span> | ./ws -d &amp;amp;
&lt;span style="color:#75715e"># Encode, connect and send&lt;/span>
echo &lt;span style="color:#e6db74">&amp;#34;This is some text passed through TCP&amp;#34;&lt;/span> | ./ws | nc -N localhost &lt;span style="color:#ae81ff">8080&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>This is some text passed through TCP
&lt;/code>&lt;/pre>&lt;p>Here&amp;rsquo;s what we see in Wireshark using the &amp;lsquo;Follow TCP Stream&amp;rsquo; option:&lt;/p>
&lt;p>&lt;img src="tcp_follow.png" alt="Wireshark TCP Follow">&lt;/p>
&lt;p>Exactly what I was looking for.&lt;/p>
&lt;h1 id="the-code">The Code&lt;/h1>
&lt;p>In this section I want to highlight a couple of sections of the code that I enjoyed writing, or had some nuance to them. This includes the lookup tables, the encoding function, and the read loop.&lt;/p>
&lt;h2 id="lookup-tables">Lookup Tables&lt;/h2>
&lt;p>The lookup table used to encode the whitespace went through a few iterations. In the first version there was no lookup table. Instead I used four whitespace characters with contiguous byte values: &amp;lsquo;\t&amp;rsquo;, &amp;lsquo;\n&amp;rsquo;, &amp;lsquo;\v&amp;rsquo;, &amp;lsquo;\f&amp;rsquo;. This allowed the encoding and decoding to be a simple addition (encoding) and subtraction (decoding) of the lowest whitespace character&amp;rsquo;s value (&amp;lsquo;\t&amp;rsquo; = 9).&lt;/p>
&lt;p>The downside was that some of these characters weren&amp;rsquo;t rendered in Wireshark as whitespace, but rather as dots. I then moved to using the whitespace used in the current version, but with static encoding and decoding lookup tables. The encoding lookup table was four bytes, and the decoding lookup table was 256 bytes, with a canary value for bytes that aren&amp;rsquo;t valid whitespace in my encoding scheme.&lt;/p>
&lt;p>I then realised I could do this a better way, and ended up with the following code:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-C" data-lang="C">&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> encode_lookup_tbl[] &lt;span style="color:#f92672">=&lt;/span> { &lt;span style="color:#e6db74">&amp;#39;\t&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;\n&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;\r&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span> };
&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> decode_lookup_tbl[&lt;span style="color:#ae81ff">256&lt;/span>];
&lt;span style="color:#66d9ef">void&lt;/span> &lt;span style="color:#a6e22e">alloc_decode_lookup_tbl&lt;/span>(&lt;span style="color:#66d9ef">void&lt;/span>) {
&lt;span style="color:#66d9ef">int&lt;/span> x;
&lt;span style="color:#75715e">//Fill the entire array with our canary
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">for&lt;/span> (x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>; x &lt;span style="color:#f92672">&amp;lt;&lt;/span> &lt;span style="color:#ae81ff">256&lt;/span>; x&lt;span style="color:#f92672">++&lt;/span>) {
decode_lookup_tbl[x] &lt;span style="color:#f92672">=&lt;/span> LOOKUP_CANARY;
}
&lt;span style="color:#75715e">//Add the four encoding characters by using the inverse
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//of the encoding table
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">for&lt;/span> (x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>; x &lt;span style="color:#f92672">&amp;lt;&lt;/span> &lt;span style="color:#ae81ff">4&lt;/span>; x&lt;span style="color:#f92672">++&lt;/span>) {
decode_lookup_tbl[ encode_lookup_tbl[x] ] &lt;span style="color:#f92672">=&lt;/span> x;
}
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>encode_lookup_tbl&lt;/code> is a static array of the whitespace characters used in the encoding, but the decoding lookup table is dynamically generated. It&amp;rsquo;s first filled with a canary value, then we use the inverse of the encoding table to generate the decoding table. I like the elegance of this: if we change the encoding, the decoding is automatically updated.&lt;/p>
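&lt;p>The decoding side isn&amp;rsquo;t shown in this post, so here is a rough sketch of how such a table could be used, under a couple of assumptions: the value of &lt;code>LOOKUP_CANARY&lt;/code> is guessed as 0xFF, and &lt;code>build_decode_tbl&lt;/code> and &lt;code>decode_block&lt;/code> are hypothetical names rather than functions from the repository.&lt;/p>

```c
#include <assert.h>
#include <stddef.h>

#define LOOKUP_CANARY 0xFF /* assumed value; the post doesn't show the real one */

static const unsigned char encode_lookup_tbl[4] = { '\t', '\n', '\r', ' ' };
static unsigned char decode_lookup_tbl[256];

/* Same idea as alloc_decode_lookup_tbl() above: canary everywhere,
 * then invert the encoding table. */
void build_decode_tbl(void) {
    for (int x = 0; x < 256; x++)
        decode_lookup_tbl[x] = LOOKUP_CANARY;
    for (int x = 0; x < 4; x++)
        decode_lookup_tbl[encode_lookup_tbl[x]] = (unsigned char)x;
}

/* Hypothetical helper: decode one four-character block into a byte.
 * Returns 0 on success, -1 if any character is not valid whitespace. */
int decode_block(const unsigned char ws_in[4], unsigned char *byte_out) {
    assert(ws_in != NULL && byte_out != NULL);
    unsigned char byte = 0;
    for (int y = 0; y < 4; y++) {
        unsigned char bits = decode_lookup_tbl[ws_in[y]];
        if (bits == LOOKUP_CANARY)
            return -1;
        /* Put the two bits back where the encoder took them from. */
        byte |= (unsigned char)(bits << (2 * y));
    }
    *byte_out = byte;
    return 0;
}
```

&lt;p>The canary only works because it can never be a valid two-bit value (0 to 3), so any byte that maps to it can be rejected with a single comparison.&lt;/p>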
&lt;h2 id="encoding">Encoding&lt;/h2>
&lt;p>The encoding also went through a couple of iterations. I first iterated only through the input bytes, and indexed the output bytes using &lt;code>x, x + 1, x + 2&lt;/code>, etc.&lt;/p>
&lt;p>The final encoding function ended up like this:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-C" data-lang="C">ssize_t &lt;span style="color:#a6e22e">ws_encode&lt;/span>(
&lt;span style="color:#66d9ef">const&lt;/span> &lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#f92672">*&lt;/span>bytes_in,
&lt;span style="color:#66d9ef">unsigned&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#f92672">*&lt;/span>ws_out,
&lt;span style="color:#66d9ef">const&lt;/span> ssize_t bytes
) {
&lt;span style="color:#66d9ef">int&lt;/span> x, y;
&lt;span style="color:#66d9ef">for&lt;/span> (x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>; x &lt;span style="color:#f92672">&amp;lt;&lt;/span> bytes; x&lt;span style="color:#f92672">++&lt;/span>) {
&lt;span style="color:#66d9ef">for&lt;/span> (y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>; y &lt;span style="color:#f92672">&amp;lt;&lt;/span> &lt;span style="color:#ae81ff">4&lt;/span>; y&lt;span style="color:#f92672">++&lt;/span>) {
ws_out[(&lt;span style="color:#ae81ff">4&lt;/span> &lt;span style="color:#f92672">*&lt;/span> x) &lt;span style="color:#f92672">+&lt;/span> y] &lt;span style="color:#f92672">=&lt;/span> encode_lookup_tbl[
(bytes_in[x] &lt;span style="color:#f92672">&amp;gt;&amp;gt;&lt;/span> (&lt;span style="color:#ae81ff">2&lt;/span> &lt;span style="color:#f92672">*&lt;/span> y)) &lt;span style="color:#f92672">&amp;amp;&lt;/span> &lt;span style="color:#ae81ff">0x03&lt;/span>
];
}
}
&lt;span style="color:#66d9ef">return&lt;/span> x &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#ae81ff">4&lt;/span>;
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>I moved to a double loop, with &lt;code>x&lt;/code> indexing into the input array, &lt;code>(4 * x ) + y&lt;/code> indexing into the output array. I will admit that while I reduced the lines of code, I&amp;rsquo;ve probably made it much harder to interpret.&lt;/p>
&lt;h2 id="reading">Reading&lt;/h2>
&lt;p>The reading loop when encoding is really simple: read bytes from standard in into a buffer, encode these bytes, then write them to standard out.&lt;/p>
&lt;p>However decoding is a little more nuanced. The challenge is that there&amp;rsquo;s no guarantee how many bytes a &lt;code>read()&lt;/code> system call will return; it could be anything from 1 up to the &lt;code>size_t count&lt;/code> value you pass to it. For decoding, our encoded bytes come in as four-byte blocks of whitespace. We need to make sure that we&amp;rsquo;re not passing a split block into our decoder.&lt;/p>
&lt;p>To ensure this doesn&amp;rsquo;t occur, I keep track of the bytes read on each read call, and our position in the read buffer. If the number of bytes read is a multiple of four, we&amp;rsquo;re fine. If not, we need to go back and read more bytes until either:&lt;/p>
&lt;ol>
&lt;li>The total is a multiple of four, or&lt;/li>
&lt;li>We&amp;rsquo;ve filled our input buffer, which is also a multiple of four bytes&lt;/li>
&lt;/ol>
&lt;p>I won&amp;rsquo;t post the read loop code, but you can see it &lt;a href="https://github.com/gregfoletta/whitespacer/blob/master/main.c#L46">here&lt;/a>.&lt;/p>
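&lt;p>For readers who&amp;rsquo;d rather not click through, here is a rough sketch of one way to satisfy those two conditions. It is not the loop from the repository: the function name &lt;code>read_whole_blocks&lt;/code> and its return convention are assumptions made for illustration.&lt;/p>

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill buf from fd, returning only when the number of
 * bytes held is a non-zero multiple of four, the buffer is full, or the
 * input has ended. cap must itself be a multiple of four.
 * Returns the byte count, 0 at end of input, or -1 on a read error. */
ssize_t read_whole_blocks(int fd, unsigned char *buf, size_t cap) {
    assert(buf != NULL && cap > 0 && cap % 4 == 0);
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;          /* end of input; total may be a partial block */
        total += (size_t)n;
        if (total % 4 == 0)
            break;          /* only whole blocks in the buffer */
    }
    return (ssize_t)total;
}
```

&lt;p>With this shape, a return value that isn&amp;rsquo;t a multiple of four can only mean the input ended mid-block, which the caller can treat as truncated input.&lt;/p>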
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>Is this a useful program? Probably not. Was it very difficult to write? There was some nuance, but it wasn&amp;rsquo;t too hard. It was however a fun project with a small and clearly defined scope. This made it enjoyable to work on in the handful of spare hours available to me.&lt;/p></description></item><item><title>What the #!</title><link>https://clt.blog.foletta.net/post/2021-04-19-what-the/</link><pubDate>Tue, 11 May 2021 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/2021-04-19-what-the/</guid><description>&lt;p>Working with computers you take a lot for granted. You assume your press of the keyboard will bubble up through the kernel to your terminal, your HTTP request will remain intact after travelling halfway across the globe, and that your stream of a cat video will be decoded and rendered on your screen. Taking these things for granted isn&amp;rsquo;t a negative, in fact quite the opposite. The countless abstractions and indirections that hide the internal details of a computer are the reason that people - computer science degree or not - can use them in some shape or form.&lt;/p>
&lt;p>But at times it&amp;rsquo;s an unsatisfying feeling not knowing how something is working under the hood, and there&amp;rsquo;s a want to &amp;ldquo;pay some attention to the person behind the curtain&amp;rdquo;. This happened recently to me when writing a script and adding the obligatory hashbang (#!) to the first line. I&amp;rsquo;ve done this hundreds of times before and know that this specifies the interpreter (and optional arguments) that run the rest of the file, but I wanted to understand how this worked: where is the line parsed, and how is the interpreter run?&lt;/p>
&lt;p>So in this article, join me for a dive through user and kernel space in order to answer the question:&lt;/p>
&lt;blockquote>
&lt;p>How is an interpreter called when specified using a hashbang in the first line of a script?&lt;/p>
&lt;/blockquote>
&lt;h1 id="two-notes">Two Notes&lt;/h1>
&lt;p>When discussing kernel components, we&amp;rsquo;ll be using the following version of Linux:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">uname -sor
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>Linux 4.15.0-142-generic GNU/Linux
&lt;/code>&lt;/pre>&lt;p>In places I&amp;rsquo;ll be pasting snippets of C from the kernel source code. The code snippets won&amp;rsquo;t include things such as error checking, locking, and other items I deem only tangentially related to the core question of this article. I will include a link to the full source of each function, and ellipses (&amp;hellip;) will be added to show you that code has been removed.&lt;/p>
&lt;h1 id="execution-and-userspace">Execution and Userspace&lt;/h1>
&lt;p>Let&amp;rsquo;s dive in - here&amp;rsquo;s an example Perl script that we&amp;rsquo;ll run, and the modification of its permission to allow it to be executed.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">cat data/foo.pl
chmod u+x data/foo.pl
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper;
print Dumper [@ARGV];
&lt;/code>&lt;/pre>&lt;p>The first investigative tool we&amp;rsquo;ll use is &lt;code>strace&lt;/code>, which attaches itself to a process and intercepts system calls. In the below code snippet, we run strace with the -e argument to filter out all but the system calls we&amp;rsquo;re interested in. I&amp;rsquo;ve done this for brevity within the article, but you&amp;rsquo;d likely want to look through the whole trace to get a firm idea about what the process is doing.&lt;/p>
&lt;p>A bash process is run, which then executes our script:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">strace -e trace&lt;span style="color:#f92672">=&lt;/span>vfork,fork,clone,execve bash -c &lt;span style="color:#e6db74">&amp;#39;./data/foo.pl argument_1&amp;#39;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>execve(&amp;quot;/bin/bash&amp;quot;, [&amp;quot;bash&amp;quot;, &amp;quot;-c&amp;quot;, &amp;quot;./data/foo.pl argument_1&amp;quot;], 0x7ffffdf84020 /* 100 vars */) = 0
execve(&amp;quot;./data/foo.pl&amp;quot;, [&amp;quot;./data/foo.pl&amp;quot;, &amp;quot;argument_1&amp;quot;], 0x563c6922a890 /* 100 vars */) = 0
$VAR1 = [
'argument_1'
];
+++ exited with 0 +++
&lt;/code>&lt;/pre>&lt;p>The strace utility shows us two process executions: the initial bash shell (which will have followed a &lt;code>clone()&lt;/code> call from the original shell that isn&amp;rsquo;t captured), then the path of our script being passed directly to the &lt;code>execve()&lt;/code> system call. This is the system call that executes processes. Its prototype is:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">&lt;span style="color:#66d9ef">int&lt;/span> &lt;span style="color:#a6e22e">execve&lt;/span>(
&lt;span style="color:#66d9ef">const&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#f92672">*&lt;/span>filename,
&lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#f92672">*&lt;/span>&lt;span style="color:#66d9ef">const&lt;/span> argv[],
&lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#f92672">*&lt;/span>&lt;span style="color:#66d9ef">const&lt;/span> envp[]
);
&lt;/code>&lt;/pre>&lt;/div>&lt;p>with &lt;code>*filename&lt;/code> containing the path to the program to run, &lt;code>*argv[]&lt;/code> containing the command line arguments, and &lt;code>*envp[]&lt;/code> containing the environment variables.&lt;/p>
&lt;p>What does this tell us? It tells us that the script is passed directly to this system call, and it&amp;rsquo;s not the bash process that parses or acts on the hashbang line.&lt;/p>
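&lt;p>One way to convince yourself of this without a shell in the picture at all is to call &lt;code>execve()&lt;/code> on a script from a small C program. The following is only a sketch: &lt;code>run_script&lt;/code> is a hypothetical helper name, and it assumes the path points at an executable file.&lt;/p>

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: fork, then hand the path straight to execve() in the
 * child. No shell is involved, so any hashbang handling has to happen on the
 * kernel side of the system call.
 * Returns the child's exit status, or -1 on failure. */
int run_script(const char *path, const char *arg) {
    assert(path != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* argv[0] is the path itself; arg may be NULL for no arguments. */
        char *const argv[] = { (char *)path, (char *)arg, NULL };
        execve(path, argv, environ);
        _exit(127);     /* only reached if execve() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

&lt;p>Calling &lt;code>run_script("./data/foo.pl", "argument_1")&lt;/code> should produce the same Dumper output as the strace run above, even though no bash process ever sees the script.&lt;/p>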
&lt;h1 id="a-quick-look-in-glibc">A Quick Look in glibc&lt;/h1>
&lt;p>We&amp;rsquo;ll stay in userspace a little longer, as the bash process doesn&amp;rsquo;t call the system call directly. The &lt;code>execve()&lt;/code> function is part of the standard C library (on my machine it&amp;rsquo;s glibc), to which bash is dynamically linked. While it&amp;rsquo;s unlikely that anything of significance is occurring in the library, let&amp;rsquo;s be rigorous and take a look.&lt;/p>
&lt;p>We can use the &lt;code>ldd&lt;/code> utility to print out the dynamic libraries linked at runtime by the dynamic linker:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">ldd &lt;span style="color:#66d9ef">$(&lt;/span>which bash&lt;span style="color:#66d9ef">)&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code> linux-vdso.so.1 (0x00007ffea7d8c000)
libtinfo.so.5 =&amp;gt; /lib/x86_64-linux-gnu/libtinfo.so.5 (0x00007f289d576000)
libdl.so.2 =&amp;gt; /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f289d372000)
libc.so.6 =&amp;gt; /lib/x86_64-linux-gnu/libc.so.6 (0x00007f289cf81000)
/lib64/ld-linux-x86-64.so.2 (0x00007f289daba000)
&lt;/code>&lt;/pre>&lt;p>We can see that libc on my machine is located at &lt;em>/lib/x86_64-linux-gnu/libc.so.6&lt;/em>. Using &lt;code>objdump&lt;/code> we can disassemble the shared library, and we extract out the section that&amp;rsquo;s related to &lt;code>execve()&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-bash" data-lang="bash">objdump -d /lib/x86_64-linux-gnu/libc.so.6 | sed -n &lt;span style="color:#e6db74">&amp;#39;/^[[:xdigit:]]\+ &amp;lt;execve/,/^$/p&amp;#39;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>00000000000e4c00 &amp;lt;execve@@GLIBC_2.2.5&amp;gt;:
e4c00: b8 3b 00 00 00 mov $0x3b,%eax
e4c05: 0f 05 syscall
e4c07: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
e4c0d: 73 01 jae e4c10 &amp;lt;execve@@GLIBC_2.2.5+0x10&amp;gt;
e4c0f: c3 retq
e4c10: 48 8b 0d 51 62 30 00 mov 0x306251(%rip),%rcx # 3eae68 &amp;lt;h_errlist@@GLIBC_2.2.5+0xdc8&amp;gt;
e4c17: f7 d8 neg %eax
e4c19: 64 89 01 mov %eax,%fs:(%rcx)
e4c1c: 48 83 c8 ff or $0xffffffffffffffff,%rax
e4c20: c3 retq
e4c21: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
e4c28: 00 00 00
e4c2b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
&lt;/code>&lt;/pre>&lt;p>As expected, the standard library doesn&amp;rsquo;t deal with the hashbang line either. It places 0x3b (decimal 59) - the &lt;a href="https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/syscalls/syscall_64.tbl#L70">execve system call number&lt;/a> - into the &lt;code>eax&lt;/code> register and calls the &lt;a href="https://www.felixcloutier.com/x86/syscall">fast system call x86 instruction&lt;/a>. The instructions following the &lt;code>syscall&lt;/code> instruction deal with error handling on the return from the kernel.&lt;/p>
&lt;p>It would take another whole article to dive into the process of jumping from user space into the kernel through a system call. I&amp;rsquo;ve provided a brief overview in an appendix at the bottom of this article. Instead, we&amp;rsquo;ll jump straight to the &lt;code>execve()&lt;/code> system call definition in the kernel.&lt;/p>
&lt;h1 id="delving-into-the-kernel">Delving Into the Kernel&lt;/h1>
&lt;p>We skip over the first few function calls, which only tangentially relate to the question we&amp;rsquo;re trying to answer: &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/fs/exec.c#L1923">SYSCALL_DEFINE3(execve)&lt;/a> calls &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/fs/exec.c#L1841">do_execve()&lt;/a> which calls &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/fs/exec.c#L1694">do_execveat_common()&lt;/a>.&lt;/p>
&lt;p>It&amp;rsquo;s at this point where we start to see some items of interest:&lt;/p>
&lt;ul>
&lt;li>The allocation of the &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/include/linux/binfmts.h#L17">linux_binprm&lt;/a> structure, which is the primary structure we&amp;rsquo;ll be concerned with in this article. The members we&amp;rsquo;re focused on are:
&lt;ul>
&lt;li>&lt;code>char buf[BINPRM_BUF_SIZE]&lt;/code> - holds the first 128 bytes of the file being executed.&lt;/li>
&lt;li>&lt;code>int argc, envc&lt;/code> - our command line argument and environment counts&lt;/li>
&lt;li>&lt;code>const char * filename&lt;/code> - the name of the binary that&amp;rsquo;s seen by the &amp;lsquo;ps&amp;rsquo; utility.&lt;/li>
&lt;li>&lt;code>const char * interp&lt;/code> - name of the binary that was really executed.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The opening of the file based on the path passed to the system call.&lt;/li>
&lt;li>Initialisation of some temporary stack space to hold the command line arguments and environment variables.&lt;/li>
&lt;li>Copying of the first 128 bytes of the executed file to a buffer.&lt;/li>
&lt;li>Counting of the number of argument and environment variables.&lt;/li>
&lt;li>Copying of the argument and environment variables on to the stack.&lt;/li>
&lt;/ul>
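&lt;p>To make the buffer step in the list above more concrete, here is a userspace analogue. It is not kernel code and only mimics the idea: zero a 128-byte buffer, then copy the head of the file into it. The names &lt;code>read_head&lt;/code> and &lt;code>starts_with_hashbang&lt;/code> are hypothetical, and the two-byte check is simply what a hashbang test needs, not a copy of any kernel function.&lt;/p>

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BINPRM_BUF_SIZE 128 /* matches the 128 bytes described above */

/* Hypothetical userspace analogue: zero the buffer, then copy in up to the
 * first 128 bytes of the file. Returns bytes read, or -1 on error. */
ssize_t read_head(const char *path, char buf[BINPRM_BUF_SIZE]) {
    assert(path != NULL && buf != NULL);
    memset(buf, 0, BINPRM_BUF_SIZE);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, BINPRM_BUF_SIZE);
    close(fd);
    return n;
}

/* A hashbang is recognisable from the first two bytes of that buffer. */
int starts_with_hashbang(const char buf[BINPRM_BUF_SIZE]) {
    return buf[0] == '#' && buf[1] == '!';
}
```

&lt;p>Because the buffer is zeroed first, a file shorter than 128 bytes leaves the tail as NUL bytes, so anything that later scans the buffer as a string stops at the end of what was actually read.&lt;/p>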
&lt;p>Here&amp;rsquo;s the code with my comments inline:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-C" data-lang="C">&lt;span style="color:#66d9ef">static&lt;/span> &lt;span style="color:#66d9ef">int&lt;/span> &lt;span style="color:#a6e22e">do_execveat_common&lt;/span>(&lt;span style="color:#66d9ef">int&lt;/span> fd, &lt;span style="color:#66d9ef">struct&lt;/span> filename &lt;span style="color:#f92672">*&lt;/span>filename,
&lt;span style="color:#66d9ef">struct&lt;/span> user_arg_ptr argv,
&lt;span style="color:#66d9ef">struct&lt;/span> user_arg_ptr envp,
&lt;span style="color:#66d9ef">int&lt;/span> flags)
{
&lt;span style="color:#75715e">//The most pertinent structure
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">struct&lt;/span> linux_binprm &lt;span style="color:#f92672">*&lt;/span>bprm;
...
&lt;span style="color:#75715e">//Allocate space for the structure
&lt;/span>&lt;span style="color:#75715e">&lt;/span> bprm &lt;span style="color:#f92672">=&lt;/span> kzalloc(&lt;span style="color:#66d9ef">sizeof&lt;/span>(&lt;span style="color:#f92672">*&lt;/span>bprm), GFP_KERNEL);
...
&lt;span style="color:#75715e">//Open our file that&amp;#39;s being executed, and add it
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//to the bprm structure
&lt;/span>&lt;span style="color:#75715e">&lt;/span> file &lt;span style="color:#f92672">=&lt;/span> do_open_execat(fd, filename, flags);
bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>file &lt;span style="color:#f92672">=&lt;/span> file;
&lt;span style="color:#75715e">//Copy the name of the file to the structure
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//The &amp;#39;else&amp;#39; deals with situations where a
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//file descriptor (/dev/fd/*) has been passed
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//to the execve call.
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">if&lt;/span> (fd &lt;span style="color:#f92672">==&lt;/span> AT_FDCWD &lt;span style="color:#f92672">||&lt;/span> filename&lt;span style="color:#f92672">-&amp;gt;&lt;/span>name[&lt;span style="color:#ae81ff">0&lt;/span>] &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;/&amp;#39;&lt;/span>) {
bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>filename &lt;span style="color:#f92672">=&lt;/span> filename&lt;span style="color:#f92672">-&amp;gt;&lt;/span>name;
} &lt;span style="color:#66d9ef">else&lt;/span> {
...
}
&lt;span style="color:#75715e">//At this stage, our interpreter is the same
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//as the file being executed.
&lt;/span>&lt;span style="color:#75715e">&lt;/span> bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>interp &lt;span style="color:#f92672">=&lt;/span> bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>filename;
&lt;span style="color:#75715e">//Create some temporary stack space
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//to copy the command and environment
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//variables to
&lt;/span>&lt;span style="color:#75715e">&lt;/span> retval &lt;span style="color:#f92672">=&lt;/span> bprm_mm_init(bprm);
&lt;span style="color:#75715e">//Count the argument variables.
&lt;/span>&lt;span style="color:#75715e">&lt;/span> bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>argc &lt;span style="color:#f92672">=&lt;/span> count(argv, MAX_ARG_STRINGS);
&lt;span style="color:#75715e">//Count the environment variables.
&lt;/span>&lt;span style="color:#75715e">&lt;/span> bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>envc &lt;span style="color:#f92672">=&lt;/span> count(envp, MAX_ARG_STRINGS);
&lt;span style="color:#75715e">//In this function, the bprm-&amp;gt;buf character
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//array is zeroed out, and the first 128 bytes
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//of the file are copied into it.
&lt;/span>&lt;span style="color:#75715e">&lt;/span> retval &lt;span style="color:#f92672">=&lt;/span> prepare_binprm(bprm);
&lt;span style="color:#75715e">//Copy the filename on to the stack, which becomes
&lt;/span>&lt;span style="color:#75715e">&lt;/span> retval &lt;span style="color:#f92672">=&lt;/span> copy_strings_kernel(&lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#f92672">&amp;amp;&lt;/span>bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>filename, bprm);
&lt;span style="color:#75715e">//Copy the environment variables on to the temporary
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//stack
&lt;/span>&lt;span style="color:#75715e">&lt;/span> retval &lt;span style="color:#f92672">=&lt;/span> copy_strings(bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>envc, envp, bprm);
&lt;span style="color:#75715e">//Copy the command line arguments on to the
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//temporary stack
&lt;/span>&lt;span style="color:#75715e">&lt;/span> retval &lt;span style="color:#f92672">=&lt;/span> copy_strings(bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>argc, argv, bprm);
...
retval &lt;span style="color:#f92672">=&lt;/span> exec_binprm(bprm);
...
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The next function called, &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/fs/exec.c#L1669">exec_binprm()&lt;/a>, has as its main responsibility the calling of &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/fs/exec.c#L1616">search_binary_handler()&lt;/a>. Let&amp;rsquo;s look at how that&amp;rsquo;s dealt with.&lt;/p>
&lt;h1 id="binary-handler-search">Binary Handler Search&lt;/h1>
&lt;p>The &lt;code>search_binary_handler()&lt;/code> function is responsible for iterating through the list of supported binary formats and calling the &lt;code>load_binary()&lt;/code> function of each one.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-C" data-lang="C">&lt;span style="color:#66d9ef">int&lt;/span> &lt;span style="color:#a6e22e">search_binary_handler&lt;/span>(&lt;span style="color:#66d9ef">struct&lt;/span> linux_binprm &lt;span style="color:#f92672">*&lt;/span>bprm)
{
&lt;span style="color:#66d9ef">struct&lt;/span> linux_binfmt &lt;span style="color:#f92672">*&lt;/span>fmt;
&lt;span style="color:#66d9ef">int&lt;/span> retval;
...
&lt;span style="color:#75715e">/* This allows 4 levels of binfmt rewrites before failing hard. */&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> (bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>recursion_depth &lt;span style="color:#f92672">&amp;gt;&lt;/span> &lt;span style="color:#ae81ff">5&lt;/span>)
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#f92672">-&lt;/span>ELOOP;
...
list_for_each_entry(fmt, &lt;span style="color:#f92672">&amp;amp;&lt;/span>formats, lh) {
...
bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>recursion_depth&lt;span style="color:#f92672">++&lt;/span>;
retval &lt;span style="color:#f92672">=&lt;/span> fmt&lt;span style="color:#f92672">-&amp;gt;&lt;/span>load_binary(bprm);
bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>recursion_depth&lt;span style="color:#f92672">--&lt;/span>;
...
}
...
&lt;span style="color:#66d9ef">return&lt;/span> retval;
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>formats&lt;/code> global variable is a linked list of &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/include/linux/binfmts.h#L92">linux_binfmt&lt;/a> structures. Some of these are built into the kernel, but they can also be provided by loadable kernel modules. Each format is registered using the &lt;a href="https://elixir.bootlin.com/linux/v4.15/C/ident/register_binfmt">register_binfmt()&lt;/a> function. The built-in formats include the common ELF format and the older a.out format, but the one we are most interested in is the &amp;lsquo;script&amp;rsquo; format.&lt;/p>
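&lt;p>To make the dispatch concrete, here is a minimal userspace sketch of the same pattern: a linked list of handlers, each of which either accepts a buffer or returns &lt;code>-ENOEXEC&lt;/code>. The names &lt;code>fmt_handler&lt;/code>, &lt;code>load_elf_sketch()&lt;/code>, &lt;code>load_script_sketch()&lt;/code> and &lt;code>find_handler()&lt;/code> are made up for illustration and are not kernel identifiers. The sketch also assumes the search stops at the first handler that returns something other than &lt;code>-ENOEXEC&lt;/code>; the real loop does considerably more.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">#include &amp;lt;errno.h&amp;gt;
#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;string.h&amp;gt;

/* Hypothetical userspace stand-in for struct linux_binfmt. */
struct fmt_handler {
    const char *name;
    /* Returns 0 if the handler accepts the buffer, -ENOEXEC if not. */
    int (*load)(const char *buf, size_t len);
    struct fmt_handler *next;
};

/* ELF files start with the four bytes 0x7f, E, L, F. */
static int load_elf_sketch(const char *buf, size_t len)
{
    if (len &amp;gt;= 4 &amp;amp;&amp;amp; memcmp(buf, &amp;#34;\177ELF&amp;#34;, 4) == 0)
        return 0;
    return -ENOEXEC;
}

/* Scripts start with the two bytes of the hashbang. */
static int load_script_sketch(const char *buf, size_t len)
{
    if (len &amp;gt;= 2 &amp;amp;&amp;amp; buf[0] == &amp;#39;#&amp;#39; &amp;amp;&amp;amp; buf[1] == &amp;#39;!&amp;#39;)
        return 0;
    return -ENOEXEC;
}

/* Walk the list and return the first handler that accepts the buffer. */
static const struct fmt_handler *find_handler(const struct fmt_handler *head,
                                              const char *buf, size_t len)
{
    const struct fmt_handler *fmt;

    for (fmt = head; fmt != NULL; fmt = fmt-&amp;gt;next) {
        if (fmt-&amp;gt;load(buf, len) != -ENOEXEC)
            return fmt;
    }
    return NULL;
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Given a list with the ELF handler first and the script handler second, a buffer starting with &lt;code>#!&lt;/code> falls through the ELF handler and is claimed by the script handler, while a buffer that matches neither yields &lt;code>NULL&lt;/code>.&lt;/p>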
&lt;p>Here&amp;rsquo;s the structure and the code that registers the script format:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">&lt;span style="color:#66d9ef">static&lt;/span> &lt;span style="color:#66d9ef">struct&lt;/span> linux_binfmt script_format &lt;span style="color:#f92672">=&lt;/span> {
.module &lt;span style="color:#f92672">=&lt;/span> THIS_MODULE,
.load_binary &lt;span style="color:#f92672">=&lt;/span> load_script,
};
&lt;span style="color:#66d9ef">static&lt;/span> &lt;span style="color:#66d9ef">int&lt;/span> __init &lt;span style="color:#a6e22e">init_script_binfmt&lt;/span>(&lt;span style="color:#66d9ef">void&lt;/span>)
{
register_binfmt(&lt;span style="color:#f92672">&amp;amp;&lt;/span>script_format);
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>;
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>We can see that for a script, the &lt;code>load_binary&lt;/code> function pointer that the &lt;code>search_binary_handler()&lt;/code> function will dispatch points to the &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/fs/binfmt_script.c#L17">load_script()&lt;/a> function. Let&amp;rsquo;s now turn our attention to this.&lt;/p>
&lt;h1 id="script-binary-format">Script Binary Format&lt;/h1>
&lt;p>The &lt;code>load_script()&lt;/code> function must first determine whether it is the appropriate handler for the file that&amp;rsquo;s being executed. The &lt;code>linux_binprm&lt;/code> structure holds the first 128 bytes of the file to be executed in its &lt;code>buf&lt;/code> member. The function looks at the first two bytes and checks whether they are the hashbang (&lt;code>#!&lt;/code>). If not, it returns &lt;code>-ENOEXEC&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">&lt;span style="color:#66d9ef">static&lt;/span> &lt;span style="color:#66d9ef">int&lt;/span> &lt;span style="color:#a6e22e">load_script&lt;/span>(&lt;span style="color:#66d9ef">struct&lt;/span> linux_binprm &lt;span style="color:#f92672">*&lt;/span>bprm)
{
&lt;span style="color:#66d9ef">const&lt;/span> &lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#f92672">*&lt;/span>i_arg, &lt;span style="color:#f92672">*&lt;/span>i_name;
&lt;span style="color:#66d9ef">char&lt;/span> &lt;span style="color:#f92672">*&lt;/span>cp;
&lt;span style="color:#66d9ef">struct&lt;/span> file &lt;span style="color:#f92672">*&lt;/span>file;
&lt;span style="color:#66d9ef">int&lt;/span> retval;
&lt;span style="color:#66d9ef">if&lt;/span> ((bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>buf[&lt;span style="color:#ae81ff">0&lt;/span>] &lt;span style="color:#f92672">!=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;#&amp;#39;&lt;/span>) &lt;span style="color:#f92672">||&lt;/span> (bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>buf[&lt;span style="color:#ae81ff">1&lt;/span>] &lt;span style="color:#f92672">!=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;!&amp;#39;&lt;/span>))
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#f92672">-&lt;/span>ENOEXEC;
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The rest of the function can be broken down into four parts:&lt;/p>
&lt;ol>
&lt;li>Parsing the interpreter and arguments&lt;/li>
&lt;li>Splitting the interpreter and arguments&lt;/li>
&lt;li>Updating the command line arguments&lt;/li>
&lt;li>Calling the binary handler search again&lt;/li>
&lt;/ol>
&lt;h2 id="parsing-the-interpreter--arguments">Parsing the Interpreter &amp;amp; Arguments&lt;/h2>
&lt;p>At this point the function knows it&amp;rsquo;s a script, so now it has to extract the interpreter and any arguments out of the 128 byte buffer. I&amp;rsquo;ve commented each line of this process below:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">&lt;span style="color:#75715e">//Add a NUL to the end so the string is NUL terminated.
&lt;/span>&lt;span style="color:#75715e">&lt;/span>bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>buf[BINPRM_BUF_SIZE &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>] &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\0&amp;#39;&lt;/span>;
&lt;span style="color:#75715e">//The end of the string is either:
&lt;/span>&lt;span style="color:#75715e">// a) A newline character, or
&lt;/span>&lt;span style="color:#75715e">// b) The end of the buffer
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">if&lt;/span> ((cp &lt;span style="color:#f92672">=&lt;/span> strchr(bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>buf, &lt;span style="color:#e6db74">&amp;#39;\n&amp;#39;&lt;/span>)) &lt;span style="color:#f92672">==&lt;/span> NULL)
cp &lt;span style="color:#f92672">=&lt;/span> bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>buf&lt;span style="color:#f92672">+&lt;/span>BINPRM_BUF_SIZE&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#ae81ff">1&lt;/span>;
&lt;span style="color:#75715e">//For a) above, replaces the newline with a NUL.
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//If it was b) above, it redundantly replaces a NUL
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//with another NUL
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\0&amp;#39;&lt;/span>;
&lt;span style="color:#75715e">//Work our way backwards through the buffer
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">while&lt;/span> (cp &lt;span style="color:#f92672">&amp;gt;&lt;/span> bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>buf) {
cp&lt;span style="color:#f92672">--&lt;/span>;
&lt;span style="color:#75715e">//If the character is whitespace, replace it with
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//a NUL
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">if&lt;/span> ((&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span>) &lt;span style="color:#f92672">||&lt;/span> (&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\t&amp;#39;&lt;/span>))
&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\0&amp;#39;&lt;/span>;
&lt;span style="color:#75715e">//Otherwise, we&amp;#39;ve found the end of the interpreter
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//string
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">else&lt;/span>
&lt;span style="color:#66d9ef">break&lt;/span>;
}
&lt;span style="color:#75715e">//After the hashbang (the buf + 2), remove any whitespace
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">for&lt;/span> (cp &lt;span style="color:#f92672">=&lt;/span> bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>buf&lt;span style="color:#f92672">+&lt;/span>&lt;span style="color:#ae81ff">2&lt;/span>; (&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span>) &lt;span style="color:#f92672">||&lt;/span> (&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\t&amp;#39;&lt;/span>); cp&lt;span style="color:#f92672">++&lt;/span>);
&lt;span style="color:#75715e">//If we hit a NUL, the line only contains a hashbang
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">//with no interpreter
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">if&lt;/span> (&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\0&amp;#39;&lt;/span>)
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#f92672">-&lt;/span>ENOEXEC; &lt;span style="color:#75715e">/* No interpreter name found */&lt;/span>
&lt;span style="color:#75715e">//i_name (and cp) points to the start of the interpreter string
&lt;/span>&lt;span style="color:#75715e">&lt;/span> i_name &lt;span style="color:#f92672">=&lt;/span> cp;
&lt;/code>&lt;/pre>&lt;/div>&lt;p>At the end of this process, &lt;code>i_name&lt;/code> points to the first character of a NUL terminated string containing the path to the interpreter and arguments, with whitespace before and after being removed.&lt;/p>
&lt;pre>&lt;code>+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+----+
| / | u | s | r | / | b | i | n | / | e | n | v | | p | e | r | l | \0 |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+----+
^
|
+------+
|i_name|
+------+
&lt;/code>&lt;/pre>&lt;h2 id="splitting-the-interpreter-and-arguments">Splitting the Interpreter and Arguments&lt;/h2>
&lt;p>The string now needs to be split into its components: the path to the interpreter, and any arguments to that interpreter:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">i_arg &lt;span style="color:#f92672">=&lt;/span> NULL;
&lt;span style="color:#75715e">//cp still points to the start of the interpreter string,
&lt;/span>&lt;span style="color:#75715e">//excluding any whitespace.
&lt;/span>&lt;span style="color:#75715e">//
&lt;/span>&lt;span style="color:#75715e">//Move along the string until we either hit
&lt;/span>&lt;span style="color:#75715e">// a) A NUL character, or
&lt;/span>&lt;span style="color:#75715e">// b) A space or a tab
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">for&lt;/span> ( ; &lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">&amp;amp;&amp;amp;&lt;/span> (&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">!=&lt;/span> &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span>) &lt;span style="color:#f92672">&amp;amp;&amp;amp;&lt;/span> (&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">!=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\t&amp;#39;&lt;/span>); cp&lt;span style="color:#f92672">++&lt;/span>)
&lt;span style="color:#75715e">/* nothing */&lt;/span> ;
&lt;span style="color:#75715e">//If there is whitespace, replace it with a NUL character
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">while&lt;/span> ((&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span>) &lt;span style="color:#f92672">||&lt;/span> (&lt;span style="color:#f92672">*&lt;/span>cp &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\t&amp;#39;&lt;/span>))
&lt;span style="color:#f92672">*&lt;/span>cp&lt;span style="color:#f92672">++&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;\0&amp;#39;&lt;/span>;
&lt;span style="color:#75715e">//If there are bytes after the whitespace, these become
&lt;/span>&lt;span style="color:#75715e">//the arguments to the interpreter.
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">if&lt;/span> (&lt;span style="color:#f92672">*&lt;/span>cp)
i_arg &lt;span style="color:#f92672">=&lt;/span> cp;
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Again, taking our example script, the pointers now point to the following pieces of memory:&lt;/p>
&lt;pre>&lt;code>+---+---+---+---+---+---+---+---+---+---+---+---+----+---+---+---+---+----+
| / | u | s | r | / | b | i | n | / | e | n | v | \0 | p | e | r | l | \0 |
+---+---+---+---+---+---+---+---+---+---+---+---+----+---+---+---+---+----+
^ ^
| |
+------+ +-----+
|i_name| |i_arg|
+------+ +-----+
&lt;/code>&lt;/pre>&lt;p>One of the main implications of this code is that you cannot have whitespace in the interpreter path, as anything after the whitespace is considered arguments to the interpreter.&lt;/p>
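&lt;p>The parsing and splitting steps above can also be tried outside the kernel. What follows is a userspace sketch under a few assumptions: the helper name &lt;code>split_interp_line()&lt;/code> is hypothetical, it works on any writable NUL terminated string rather than the fixed 128 byte &lt;code>bprm-&amp;gt;buf&lt;/code>, and it trims trailing whitespace in a slightly different way to the kernel code. The resulting &lt;code>i_name&lt;/code> and &lt;code>i_arg&lt;/code> pointers should line up with the diagrams above.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;string.h&amp;gt;

/*
 * Hypothetical helper. buf must be writable and NUL terminated.
 * Returns 0 on success, -1 if buf is not a usable hashbang line.
 */
static int split_interp_line(char *buf, char **i_name, char **i_arg)
{
    char *cp;

    *i_name = NULL;
    *i_arg = NULL;

    /* Not a script unless the first two bytes are the hashbang. */
    if (buf[0] != &amp;#39;#&amp;#39; || buf[1] != &amp;#39;!&amp;#39;)
        return -1;

    /* Cut the string at the first newline, if there is one. */
    cp = strchr(buf, &amp;#39;\n&amp;#39;);
    if (cp == NULL)
        cp = buf + strlen(buf);
    *cp = &amp;#39;\0&amp;#39;;

    /* Replace trailing whitespace with NULs. */
    while (cp &amp;gt; buf + 2 &amp;amp;&amp;amp; (cp[-1] == &amp;#39; &amp;#39; || cp[-1] == &amp;#39;\t&amp;#39;))
        *--cp = &amp;#39;\0&amp;#39;;

    /* Skip whitespace between the hashbang and the interpreter. */
    for (cp = buf + 2; *cp == &amp;#39; &amp;#39; || *cp == &amp;#39;\t&amp;#39;; cp++)
        ;
    if (*cp == &amp;#39;\0&amp;#39;)
        return -1; /* no interpreter name found */
    *i_name = cp;

    /* Move to the end of the interpreter path. */
    while (*cp != &amp;#39;\0&amp;#39; &amp;amp;&amp;amp; *cp != &amp;#39; &amp;#39; &amp;amp;&amp;amp; *cp != &amp;#39;\t&amp;#39;)
        cp++;

    /* Replace the separating whitespace with NULs. */
    while (*cp == &amp;#39; &amp;#39; || *cp == &amp;#39;\t&amp;#39;)
        *cp++ = &amp;#39;\0&amp;#39;;

    /* Whatever remains is kept as one single argument. */
    if (*cp != &amp;#39;\0&amp;#39;)
        *i_arg = cp;
    return 0;
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>For a first line of &lt;code>#! /usr/bin/env perl -w&lt;/code>, this leaves &lt;code>i_name&lt;/code> pointing at &lt;code>/usr/bin/env&lt;/code> and &lt;code>i_arg&lt;/code> pointing at the single string &lt;code>perl -w&lt;/code>. Everything after the interpreter path is kept as one argument rather than being split again on whitespace.&lt;/p>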
&lt;h2 id="updating-the-arguments">Updating the Arguments&lt;/h2>
&lt;p>The arguments and argument counts now need to be updated. If we ran our script as &lt;code>./foo.pl foo_arg&lt;/code>, and the hashbang line was &lt;code>#!/usr/bin/perl perl_arg&lt;/code>, the new command line arguments need to be &lt;code>/usr/bin/perl perl_arg ./foo.pl foo_arg&lt;/code>. Because of the way the stack is laid out, this is done in reverse order.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">&lt;span style="color:#75715e">//The current argv[0] (the filename of our script) is removed
&lt;/span>&lt;span style="color:#75715e">//from the temporary stack
&lt;/span>&lt;span style="color:#75715e">&lt;/span>retval &lt;span style="color:#f92672">=&lt;/span> remove_arg_zero(bprm);
...
&lt;span style="color:#75715e">//Add in the filename of the script being executed.
&lt;/span>&lt;span style="color:#75715e">&lt;/span>retval &lt;span style="color:#f92672">=&lt;/span> copy_strings_kernel(&lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#f92672">&amp;amp;&lt;/span>bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>interp, bprm);
...
bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>argc&lt;span style="color:#f92672">++&lt;/span>;
&lt;span style="color:#66d9ef">if&lt;/span> (i_arg) {
&lt;span style="color:#75715e">//If the interpreter line has arguments, add these in to the stack.
&lt;/span>&lt;span style="color:#75715e">&lt;/span> retval &lt;span style="color:#f92672">=&lt;/span> copy_strings_kernel(&lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#f92672">&amp;amp;&lt;/span>i_arg, bprm);
...
bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>argc&lt;span style="color:#f92672">++&lt;/span>;
}
&lt;span style="color:#75715e">//Finally, add the interpreter, which becomes argv[0].
&lt;/span>&lt;span style="color:#75715e">&lt;/span>retval &lt;span style="color:#f92672">=&lt;/span> copy_strings_kernel(&lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#f92672">&amp;amp;&lt;/span>i_name, bprm);
...
bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>argc&lt;span style="color:#f92672">++&lt;/span>;
&lt;span style="color:#75715e">//Update the &amp;#39;interp&amp;#39; bprm member, which will now
&lt;/span>&lt;span style="color:#75715e">//be different from the &amp;#39;filename&amp;#39; member.
&lt;/span>&lt;span style="color:#75715e">&lt;/span>retval &lt;span style="color:#f92672">=&lt;/span> bprm_change_interp(i_name, bprm);
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now that the interpreter has been parsed and the &lt;code>bprm&lt;/code> structure updated, the interpreter is opened as a file and &lt;code>search_binary_handler()&lt;/code> is called again, except this time it will be searching for a binary handler for our interpreter.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">file &lt;span style="color:#f92672">=&lt;/span> open_exec(i_name);
&lt;span style="color:#66d9ef">if&lt;/span> (IS_ERR(file))
&lt;span style="color:#66d9ef">return&lt;/span> PTR_ERR(file);
bprm&lt;span style="color:#f92672">-&amp;gt;&lt;/span>file &lt;span style="color:#f92672">=&lt;/span> file;
retval &lt;span style="color:#f92672">=&lt;/span> prepare_binprm(bprm);
&lt;span style="color:#66d9ef">if&lt;/span> (retval &lt;span style="color:#f92672">&amp;lt;&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>)
&lt;span style="color:#66d9ef">return&lt;/span> retval;
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#a6e22e">search_binary_handler&lt;/span>(bprm);
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The interpreter is of this type:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sh" data-lang="sh">file /usr/bin/perl
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>/usr/bin/perl: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=e865d791bb4b89f4ab5e7ec1217e38ff6c31f3ed, stripped
&lt;/code>&lt;/pre>&lt;p>Thus the ELF binary handler will be called, doing what it needs to do to load the binary and add it to the kernel scheduler, eventually running the process.&lt;/p>
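&lt;p>Returning to the argument rewrite for a moment: the kernel builds the new argument list by pushing strings on to a stack in reverse order, but the end result can be sketched in userspace with a plain array. The helper name &lt;code>rewrite_argv()&lt;/code> and its signature are hypothetical, and the strings are only referenced, not copied.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-c" data-lang="c">#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;

/*
 * Hypothetical helper: builds the argument list the interpreter will see.
 * script is the path of the script as it was passed to execve().
 * Returns a malloc()ed, NULL terminated array, or NULL if the allocation
 * fails. The caller frees the array. argc is assumed to be non-negative.
 */
static char **rewrite_argv(int argc, char **argv, char *script,
                           char *i_name, char *i_arg, int *new_argc)
{
    /* Room for: interpreter, optional i_arg, script, argv[1..], NULL. */
    char **out = malloc(((size_t)argc + 4) * sizeof(*out));
    int n = 0;
    int i;

    if (out == NULL)
        return NULL;

    out[n++] = i_name;          /* becomes the new argv[0] */
    if (i_arg != NULL)
        out[n++] = i_arg;       /* the hashbang argument, if any */
    out[n++] = script;          /* replaces the old argv[0] */
    for (i = 1; i &amp;lt; argc; i++)
        out[n++] = argv[i];     /* the original arguments */
    out[n] = NULL;

    *new_argc = n;
    return out;
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With the example from earlier, a script run as &lt;code>./foo.pl foo_arg&lt;/code> with a hashbang line of &lt;code>#!/usr/bin/perl perl_arg&lt;/code>, this yields the four arguments &lt;code>/usr/bin/perl&lt;/code>, &lt;code>perl_arg&lt;/code>, &lt;code>./foo.pl&lt;/code> and &lt;code>foo_arg&lt;/code>.&lt;/p>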
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>We started this article out with a question: how is an interpreter called when it is used in the hashbang line of a script? We used some tools to determine whether it was performed in userspace, but found that it was the kernel that performed this task.&lt;/p>
&lt;p>We went through the kernel, starting at the &lt;code>execve()&lt;/code> system call, working our way down to the binary handlers. We then went through the &amp;lsquo;script&amp;rsquo; binary handler, which matches files that have &amp;lsquo;#!&amp;rsquo; as their first two bytes. We could see how this handler parsed the interpreter line, extracting the interpreter path and optional arguments, and updating the binary to be called by the kernel.&lt;/p>
&lt;p>This is an example of an elegant and generalised solution by the Linux kernel.&lt;/p>
&lt;h1 id="appendix---system-calls">Appendix - System Calls&lt;/h1>
&lt;p>When we started delving into the kernel, we skipped over the &lt;code>syscall&lt;/code> instruction and the system call handler in the kernel; this appendix summarises what happens in between. There are a number of different variables that change the code path (page table isolation, slow and fast paths), so it&amp;rsquo;s a simplification.&lt;/p>
&lt;p>As with the rest of this article, we only consider an x86_64 processor architecture, and I&amp;rsquo;m also going to ignore the &lt;em>page table isolation&lt;/em> feature introduced to mitigate against the &lt;a href="https://meltdownattack.com/">Meltdown&lt;/a> vulnerability.&lt;/p>
&lt;ul>
&lt;li>The &lt;code>syscall&lt;/code> instruction:
&lt;ul>
&lt;li>Saves the address of the following instruction to the &lt;code>rcx&lt;/code> register&lt;/li>
&lt;li>Loads a new instruction pointer from the &lt;code>IA32_LSTAR&lt;/code> model specific register.&lt;/li>
&lt;li>Jumps to the new instruction at a ring 0 privilege level.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The &lt;code>IA32_LSTAR&lt;/code> register holds the address of &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/arch/x86/entry/entry_64.S#L206">entry_SYSCALL_64&lt;/a>, which is our system call handler.
&lt;ul>
&lt;li>This is set (per-CPU) at boot time in &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/arch/x86/kernel/cpu/common.c#L1373">syscall_init()&lt;/a>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The &lt;code>entry_SYSCALL_64&lt;/code> handler performs the re-organisation required to transition from userspace to kernel space.
&lt;ul>
&lt;li>Its main task is to push all the current registers on to some new stack space.&lt;/li>
&lt;li>There&amp;rsquo;s loads of other things, but let&amp;rsquo;s consider them out of scope for this summarisation.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The system call is then called using its number (in the &lt;code>rax&lt;/code> register) as an index into the &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/arch/x86/entry/syscall_64.c#L21">sys_call_table&lt;/a> array.
&lt;ul>
&lt;li>This is an array of function pointers to each of the system calls.&lt;/li>
&lt;li>The file in the &lt;code>#include &amp;lt;asm/syscalls_64.h&amp;gt;&lt;/code> line is generated dynamically.
&lt;ul>
&lt;li>It&amp;rsquo;s generated from the &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/arch/x86/entry/syscalls/syscall_64.tbl">syscall table file&lt;/a>.&lt;/li>
&lt;li>This is converted into a header via a simple &lt;a href="https://elixir.bootlin.com/linux/v4.15/source/arch/x86/entry/syscalls/syscallhdr.sh">shell script&lt;/a>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>A Bit on the Nose</title><link>https://clt.blog.foletta.net/post/a-bit-on-the-nose/</link><pubDate>Sun, 07 Mar 2021 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/a-bit-on-the-nose/</guid><description>&lt;p>I&amp;rsquo;ve never been particularly interested in horse racing, but I married into a family that loves it. Each in-law has their own ideas and combinations of factors that lead them to bet on a particular horse. It could be form, barrier position, track condition, trainer, jockey, or any of many others.&lt;/p>
&lt;p>After being drawn into conversations about their preferred selection methods, I wanted to come at the problem backed with data. I must admit I had an initial feeling of arrogance, thinking &amp;ldquo;of course I can do this better&amp;rdquo;. In fact I&amp;rsquo;ve seen this in many places where &amp;lsquo;data scientists&amp;rsquo; stroll into fields of enquiry armed with data and a swag of models, but lacking an understanding of the problem space. Poor assumptions abound, and incorrect conclusions are almost certainly reached.&lt;/p>
&lt;p>I was determined not to fall into the same traps, and after quashing my misplaced sense of superiority, I started to think about how to approach the problem at hand. Rather than diving straight into prediction - models akimbo - I thought the best place to start would be to create naive baselines. This would give me something to compare the performance of any subsequent models against.&lt;/p>
&lt;p>In this article I will look at two baselines. The first is to pick a random horse in each race, which will provide us with a lower bound for model predictive accuracy. The second is to pick the favourite in each race. The favourite has many of the factors that we would be using in the model already built in via the consensus of the bettors: form, barrier position, trainer, jockey, etc. Any model we create needs to approach the accuracy of this method.&lt;/p>
&lt;p>Simply put, we want to answer the following questions:&lt;/p>
&lt;blockquote>
&lt;p>How accurate are our &amp;lsquo;random&amp;rsquo; and &amp;lsquo;favourite&amp;rsquo; methods at picking the winning horse?&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>What are our long term returns using the &amp;lsquo;random&amp;rsquo; and &amp;lsquo;favourite&amp;rsquo; methods?&lt;/p>
&lt;/blockquote>
&lt;h1 id="data-information--aquisition">Data Information &amp;amp; Acquisition&lt;/h1>
&lt;p>The data was acquired by using &lt;a href="https://rvest.tidyverse.org/">rvest&lt;/a> to scrape a website that contained historical information on horse races. I was able to iterate across each race, pulling out specific variables using CSS selectors and XPaths. The dataset is for my own personal use, and I have encrypted the data that is used in this article.&lt;/p>
&lt;p>The dataset contains information on around 180,000 horse races over the period from 2011 to 2020. It&amp;rsquo;s in a tidy format, with each row containing information on each horse in each race. It includes, but isn&amp;rsquo;t limited to, the name and state of the track, the date of the race, the name of the horse, jockey and trainer, the weight the horse is carrying, race length, duration, and barrier position. Here&amp;rsquo;s a random sample from the dataset with some of the key variables selected:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">hr_results &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(
race_id, state, track,
horse.name, jockey, odds.sp,
position, barrier, weight
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">slice_sample&lt;/span>(n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">10&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">gt&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;style>html {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;
}
#hfjgovmpcs .gt_table {
display: table;
border-collapse: collapse;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
#hfjgovmpcs .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#hfjgovmpcs .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
#hfjgovmpcs .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 0;
padding-bottom: 4px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
#hfjgovmpcs .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#hfjgovmpcs .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#hfjgovmpcs .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
#hfjgovmpcs .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
#hfjgovmpcs .gt_column_spanner_outer:first-child {
padding-left: 0;
}
#hfjgovmpcs .gt_column_spanner_outer:last-child {
padding-right: 0;
}
#hfjgovmpcs .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
#hfjgovmpcs .gt_group_heading {
padding: 8px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
}
#hfjgovmpcs .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
#hfjgovmpcs .gt_from_md > :first-child {
margin-top: 0;
}
#hfjgovmpcs .gt_from_md > :last-child {
margin-bottom: 0;
}
#hfjgovmpcs .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
#hfjgovmpcs .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 12px;
}
#hfjgovmpcs .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#hfjgovmpcs .gt_first_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
}
#hfjgovmpcs .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#hfjgovmpcs .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
#hfjgovmpcs .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
#hfjgovmpcs .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#hfjgovmpcs .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#hfjgovmpcs .gt_footnote {
margin: 0px;
font-size: 90%;
padding: 4px;
}
#hfjgovmpcs .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#hfjgovmpcs .gt_sourcenote {
font-size: 90%;
padding: 4px;
}
#hfjgovmpcs .gt_left {
text-align: left;
}
#hfjgovmpcs .gt_center {
text-align: center;
}
#hfjgovmpcs .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
#hfjgovmpcs .gt_font_normal {
font-weight: normal;
}
#hfjgovmpcs .gt_font_bold {
font-weight: bold;
}
#hfjgovmpcs .gt_font_italic {
font-style: italic;
}
#hfjgovmpcs .gt_super {
font-size: 65%;
}
#hfjgovmpcs .gt_footnote_marks {
font-style: italic;
font-size: 65%;
}
&lt;/style>
&lt;div id="hfjgovmpcs" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">&lt;table class="gt_table">
&lt;thead class="gt_col_headings">
&lt;tr>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">race_id&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">state&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">track&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">horse.name&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">jockey&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">odds.sp&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">position&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">barrier&lt;/th>
&lt;th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">weight&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody class="gt_table_body">
&lt;tr>
&lt;td class="gt_row gt_right">51226&lt;/td>
&lt;td class="gt_row gt_left">QLD&lt;/td>
&lt;td class="gt_row gt_left">Doomben&lt;/td>
&lt;td class="gt_row gt_left">Daisy Duke&lt;/td>
&lt;td class="gt_row gt_left">Robbie Fradd&lt;/td>
&lt;td class="gt_row gt_right">6.00&lt;/td>
&lt;td class="gt_row gt_left">1&lt;/td>
&lt;td class="gt_row gt_right">3&lt;/td>
&lt;td class="gt_row gt_right">58.0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">134689&lt;/td>
&lt;td class="gt_row gt_left">VIC&lt;/td>
&lt;td class="gt_row gt_left">Pakenham&lt;/td>
&lt;td class="gt_row gt_left">Nautilus&lt;/td>
&lt;td class="gt_row gt_left">James Winks&lt;/td>
&lt;td class="gt_row gt_right">6.00&lt;/td>
&lt;td class="gt_row gt_left">4&lt;/td>
&lt;td class="gt_row gt_right">1&lt;/td>
&lt;td class="gt_row gt_right">54.5&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">24981&lt;/td>
&lt;td class="gt_row gt_left">WA&lt;/td>
&lt;td class="gt_row gt_left">Bunbury&lt;/td>
&lt;td class="gt_row gt_left">RABBIT NAGINA&lt;/td>
&lt;td class="gt_row gt_left">Alan Kennedy&lt;/td>
&lt;td class="gt_row gt_right">31.00&lt;/td>
&lt;td class="gt_row gt_left">6&lt;/td>
&lt;td class="gt_row gt_right">10&lt;/td>
&lt;td class="gt_row gt_right">56.5&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">78601&lt;/td>
&lt;td class="gt_row gt_left">NSW&lt;/td>
&lt;td class="gt_row gt_left">Grenfell&lt;/td>
&lt;td class="gt_row gt_left">Gaze Beyond&lt;/td>
&lt;td class="gt_row gt_left">Ms Ashleigh Stanley&lt;/td>
&lt;td class="gt_row gt_right">6.00&lt;/td>
&lt;td class="gt_row gt_left">4&lt;/td>
&lt;td class="gt_row gt_right">9&lt;/td>
&lt;td class="gt_row gt_right">53.0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">44267&lt;/td>
&lt;td class="gt_row gt_left">NSW&lt;/td>
&lt;td class="gt_row gt_left">Deniliquin&lt;/td>
&lt;td class="gt_row gt_left">Sunpoint&lt;/td>
&lt;td class="gt_row gt_left">Ms A Beer&lt;/td>
&lt;td class="gt_row gt_right">8.00&lt;/td>
&lt;td class="gt_row gt_left">6&lt;/td>
&lt;td class="gt_row gt_right">4&lt;/td>
&lt;td class="gt_row gt_right">57.0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">129895&lt;/td>
&lt;td class="gt_row gt_left">WA&lt;/td>
&lt;td class="gt_row gt_left">Northam&lt;/td>
&lt;td class="gt_row gt_left">Ram Jam&lt;/td>
&lt;td class="gt_row gt_left">Ben Kennedy&lt;/td>
&lt;td class="gt_row gt_right">2.25&lt;/td>
&lt;td class="gt_row gt_left">1&lt;/td>
&lt;td class="gt_row gt_right">4&lt;/td>
&lt;td class="gt_row gt_right">58.5&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">101574&lt;/td>
&lt;td class="gt_row gt_left">TAS&lt;/td>
&lt;td class="gt_row gt_left">Launceston&lt;/td>
&lt;td class="gt_row gt_left">Gee Gee Double Hot&lt;/td>
&lt;td class="gt_row gt_left">Scarlet So&lt;/td>
&lt;td class="gt_row gt_right">5.00&lt;/td>
&lt;td class="gt_row gt_left">1&lt;/td>
&lt;td class="gt_row gt_right">1&lt;/td>
&lt;td class="gt_row gt_right">52.0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">134173&lt;/td>
&lt;td class="gt_row gt_left">VIC&lt;/td>
&lt;td class="gt_row gt_left">Pakenham&lt;/td>
&lt;td class="gt_row gt_left">Miss Mo&lt;/td>
&lt;td class="gt_row gt_left">Ben E Thompson&lt;/td>
&lt;td class="gt_row gt_right">8.50&lt;/td>
&lt;td class="gt_row gt_left">7&lt;/td>
&lt;td class="gt_row gt_right">3&lt;/td>
&lt;td class="gt_row gt_right">58.0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">69517&lt;/td>
&lt;td class="gt_row gt_left">QLD&lt;/td>
&lt;td class="gt_row gt_left">Gold Coast&lt;/td>
&lt;td class="gt_row gt_left">Liberty Island&lt;/td>
&lt;td class="gt_row gt_left">Jag Guthmann-Chester&lt;/td>
&lt;td class="gt_row gt_right">8.00&lt;/td>
&lt;td class="gt_row gt_left">5&lt;/td>
&lt;td class="gt_row gt_right">12&lt;/td>
&lt;td class="gt_row gt_right">57.0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="gt_row gt_right">140862&lt;/td>
&lt;td class="gt_row gt_left">SA&lt;/td>
&lt;td class="gt_row gt_left">Port Augusta&lt;/td>
&lt;td class="gt_row gt_left">Gold Maestro&lt;/td>
&lt;td class="gt_row gt_left">Jeffrey Maund&lt;/td>
&lt;td class="gt_row gt_right">31.00&lt;/td>
&lt;td class="gt_row gt_left">1&lt;/td>
&lt;td class="gt_row gt_right">5&lt;/td>
&lt;td class="gt_row gt_right">54.0&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>&lt;/div>
&lt;p>We won&amp;rsquo;t use most of the variables in the data set, only a select few:&lt;/p>
&lt;ul>
&lt;li>&lt;em>race_id&lt;/em> - a unique identifier for each race. There are multiple rows with the same &lt;em>race_id&lt;/em>, each representing a horse that ran in that race.&lt;/li>
&lt;li>&lt;em>odds.sp&lt;/em> - the &amp;lsquo;starting price&amp;rsquo;, which is the &amp;ldquo;odds prevailing on a particular horse in the on-course fixed-odds betting market at the time a race begins&amp;rdquo;.&lt;/li>
&lt;li>&lt;em>position&lt;/em> - the finishing position of the horse.&lt;/li>
&lt;/ul>
&lt;p>I&amp;rsquo;ve omitted the code to load the data; however, the full source of this article (and the entire website) is available on &lt;a href="https://github.com/gregfoletta/articles.foletta.org">GitHub&lt;/a>. The data is contained in the variable &lt;code>hr_results&lt;/code>.&lt;/p>
&lt;h1 id="exploration">Exploration&lt;/h1>
&lt;p>Let&amp;rsquo;s take a look at the dataset from a few different perspectives to give us some context. First up we take a look at the number of races per month per state. We can clearly see the yearly cyclic nature, with the rise into the spring racing carnivals and a drop off over winter.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-4-1.png" width="672" />&lt;/p>
&lt;p>Next we take a look at the top 10 winning horses and trainers over this period:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-5-1.png" width="672" />&lt;/p>
&lt;p>Which tracks have run the most races over this period?&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-6-1.png" width="672" />&lt;/p>
&lt;p>Finally, what is the distribution of the starting price odds? This distribution has a very long tail, so I&amp;rsquo;ve removed the long odds above 100 to provide a better view of the most common values. What&amp;rsquo;s interesting is the bias towards odds with round numbers after the 20 mark.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-7-1.png" width="672" />&lt;/p>
&lt;h1 id="data-sampling">Data Sampling&lt;/h1>
&lt;p>With a high-level handle on the data we&amp;rsquo;re working with, let&amp;rsquo;s move on to answering the questions. The process is:&lt;/p>
&lt;ol>
&lt;li>Take a sample of races across the time period.
&lt;ul>
&lt;li>We will use 0.5% or ~800 races.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Place a dollar &amp;lsquo;bet&amp;rsquo; on a horse in each race, determined by one of our methods.&lt;/li>
&lt;li>Calculate our return (payout - stake).&lt;/li>
&lt;li>Calculate our cumulative return.&lt;/li>
&lt;li>Calculate our accuracy across all the races.&lt;/li>
&lt;li>Calculate our return per race.&lt;/li>
&lt;li>Return to 1. and repeat.&lt;/li>
&lt;/ol>
&lt;p>After the process is complete, we can look at the mean and distributions for the return per race and accuracy metrics. This process is similar to a bootstrap, except that the races within each sample are drawn &lt;em>without&lt;/em> replacement instead of &lt;em>with&lt;/em> replacement.&lt;/p>
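&lt;p>To make that difference concrete, here is a minimal sketch using base R&amp;rsquo;s &lt;code>sample()&lt;/code> on a made-up vector of race IDs. The names &lt;code>race_ids&lt;/code>, &lt;code>without_repl&lt;/code> and &lt;code>with_repl&lt;/code> are illustrative only and aren&amp;rsquo;t part of the actual analysis:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Hypothetical race IDs standing in for the real data
set.seed(42)
race_ids &amp;lt;- 1:1000
n_sample &amp;lt;- round(0.005 * length(race_ids))  # 0.5% of the races

# The approach used here: each race appears at most once in a sample
without_repl &amp;lt;- sample(race_ids, n_sample, replace = FALSE)

# A bootstrap-style draw: the same race can be drawn more than once
with_repl &amp;lt;- sample(race_ids, n_sample, replace = TRUE)

# Duplicates are impossible when sampling without replacement
anyDuplicated(without_repl) == 0
&lt;/code>&lt;/pre>&lt;/div>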
&lt;p>We select only the variables that we need, so we&amp;rsquo;re not moving huge amounts of unused data to our worker processes. (Before I realised I should be pruning the data, I was spinning up large AWS instances with 128GB of memory to perform the sampling. After the pruning I could run it on my laptop with 16GB of memory!) The data is nested based on its race ID, allowing us to sample per race rather than per horse.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Nest per race&lt;/span>
hr_results &lt;span style="color:#f92672">&amp;lt;-&lt;/span> hr_results &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(race_id, position, odds.sp) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(race_id) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">nest&lt;/span>()
&lt;span style="color:#a6e22e">head&lt;/span>(hr_results)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 6 x 2
## # Groups: race_id [6]
## race_id data
## &amp;lt;dbl&amp;gt; &amp;lt;list&amp;gt;
## 1 25488 &amp;lt;tibble [6 × 2]&amp;gt;
## 2 25489 &amp;lt;tibble [5 × 2]&amp;gt;
## 3 25490 &amp;lt;tibble [6 × 2]&amp;gt;
## 4 25491 &amp;lt;tibble [12 × 2]&amp;gt;
## 5 25492 &amp;lt;tibble [14 × 2]&amp;gt;
## 6 25493 &amp;lt;tibble [9 × 2]&amp;gt;
&lt;/code>&lt;/pre>&lt;p>The &lt;code>mc_cv()&lt;/code> (Monte-Carlo cross validation) function from the &lt;a href="https://rsample.tidymodels.org/">rsample&lt;/a> package is used to create our sampled data sets. We&amp;rsquo;re not actually performing the cross-validation part, only using the training set that comes back from the function and throwing away the test set.&lt;/p>
&lt;p>The worker function &lt;code>mc_sample()&lt;/code> is created to be passed to &lt;code>future_map()&lt;/code> so we can spread the sampling work across multiple cores.&lt;/p>
&lt;p>We generate 2000 samples of 0.5% of the total races in the dataset, or around 800 races per sample. The returned results are unnested, taking us back to our original tidy format, with each sample identified by the &lt;em>sample_id&lt;/em> variable:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Sampling function that creates a Monte-Carlo CV set&lt;/span>
&lt;span style="color:#75715e"># and returns the analysis portion.&lt;/span>
mc_sample &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(data, times, prop) {
data &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mc_cv&lt;/span>(times &lt;span style="color:#f92672">=&lt;/span> times, prop &lt;span style="color:#f92672">=&lt;/span> prop) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(analysis &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(splits, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">analysis&lt;/span>(.x))) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#a6e22e">c&lt;/span>(id, splits))
}
&lt;span style="color:#75715e"># Set up out workers&lt;/span>
&lt;span style="color:#a6e22e">plan&lt;/span>(multisession, workers &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">availableCores&lt;/span>() &lt;span style="color:#f92672">-&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>)
&lt;span style="color:#75715e"># Parallel sampling&lt;/span>
number_samples &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#ae81ff">2000&lt;/span>
hr_mccv &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">future_map&lt;/span>(
&lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>number_samples,
&lt;span style="color:#f92672">~&lt;/span>{ &lt;span style="color:#a6e22e">mc_sample&lt;/span>(hr_results, times &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>, prop &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.005&lt;/span>) },
.options &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">furrr_options&lt;/span>(seed &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#66d9ef">TRUE&lt;/span>)
)
&lt;span style="color:#75715e"># Switch plans to close workers and release memory&lt;/span>
&lt;span style="color:#a6e22e">plan&lt;/span>(sequential)
&lt;span style="color:#75715e"># Bind samples together and unnest&lt;/span>
hr_mccv &lt;span style="color:#f92672">&amp;lt;-&lt;/span> hr_mccv &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">bind_rows&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(sample_id &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#a6e22e">n&lt;/span>()) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">unnest&lt;/span>(cols &lt;span style="color:#f92672">=&lt;/span> analysis) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">unnest&lt;/span>(cols &lt;span style="color:#f92672">=&lt;/span> data)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>A &lt;code>bet_returns()&lt;/code> function is created which places a bet (default $1) for the win on each horse in the dataset it&amp;rsquo;s provided. It determines the return based on the starting price odds. The data set uses decimal (also known as continental) odds, so if we placed a $1 bet on a horse with odds of 3.0 and the horse wins, our &lt;em>payout&lt;/em> is $3, but our &lt;em>return&lt;/em> is $2 (payout - $1). If the horse doesn&amp;rsquo;t win, our payout is $0 and our return is -$1.&lt;/p>
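&lt;p>As a compact restatement of the arithmetic above (the symbols are introduced here for illustration and aren&amp;rsquo;t part of the original code), write \(b\) for the stake and \(o\) for the decimal starting price. The return \(r\) on a single bet is then:&lt;/p>
&lt;p>$$ r_{\text{win}} = b \cdot o - b = b(o - 1), \qquad r_{\text{lose}} = -b $$&lt;/p>
&lt;p>With \(b = 1\) and \(o = 3.0\) this gives \(r_{\text{win}} = 2\) and \(r_{\text{lose}} = -1\), matching the example. If we further assume the horse wins with probability \(p\), the expected return is \(p \cdot b \cdot o - b\), which is zero only when \(p = 1/o\).&lt;/p>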
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Places a bet for the win on each horse and calculates the return,&lt;/span>
&lt;span style="color:#75715e"># the cumulative return, and the cumulative return per race.&lt;/span>
bet_returns &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(data, bet &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) {
data &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
bet_return &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">if_else&lt;/span>(
position &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>,
(bet &lt;span style="color:#f92672">*&lt;/span> odds.sp) &lt;span style="color:#f92672">-&lt;/span> bet,
&lt;span style="color:#f92672">-&lt;/span>bet
)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(sample_id) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
sample_race_index &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#a6e22e">n&lt;/span>(),
cumulative_return &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">cumsum&lt;/span>(bet_return),
cumulative_rpr &lt;span style="color:#f92672">=&lt;/span> cumulative_return &lt;span style="color:#f92672">/&lt;/span> sample_race_index
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ungroup&lt;/span>()
}
&lt;/code>&lt;/pre>&lt;/div>&lt;h1 id="approach-1-random-selection">Approach 1: Random Selection&lt;/h1>
&lt;p>The first approach to take is to bet on a random horse per race.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Select a random horse from each race where there are odds available&lt;/span>
hr_random &lt;span style="color:#f92672">&amp;lt;-&lt;/span> hr_mccv &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">drop_na&lt;/span>(odds.sp) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(sample_id, race_id) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">slice_sample&lt;/span>(n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ungroup&lt;/span>()
&lt;span style="color:#75715e"># Place our bets&lt;/span>
hr_random &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">bet_returns&lt;/span>(hr_random)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Let&amp;rsquo;s first calculate the accuracy per sample, and view this as a histogram. The solid line is the mean, and the dashed lines are the 2.5% and 97.5% quantiles, showing the middle 95% range of the accuracy.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">hr_random_accuracy &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
hr_random &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(win &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">if_else&lt;/span>(position &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(sample_id) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(accuracy &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(win))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-13-1.png" width="672" />&lt;/p>
&lt;p>The random method gives us a mean accuracy of 11%, with a 95% range between 9.1% and 13.5%. That&amp;rsquo;s about a 1 in 9 chance of picking the winning horse. At first I thought this was a little low, as the average number of horses in a race was about 6. I naively assumed that the random method would give us a 1 in 6 chance of picking the winner, or a 17% accuracy level. But this assumes a uniform probability of winning for each horse, which of course is not correct.&lt;/p>
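&lt;p>The mean and 95% range quoted above are simple summaries of the per-sample accuracies. A minimal sketch of how they might be computed, using a small made-up vector (&lt;code>toy_accuracy&lt;/code> is hypothetical, standing in for the &lt;code>accuracy&lt;/code> column of &lt;code>hr_random_accuracy&lt;/code>):&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Hypothetical per-sample accuracies, not the real results
toy_accuracy &amp;lt;- c(0.09, 0.10, 0.11, 0.12, 0.13)

# Centre of the distribution (the solid line on the histogram)
mean(toy_accuracy)

# Middle 95% range (the dashed lines on the histogram)
quantile(toy_accuracy, probs = c(0.025, 0.975))
&lt;/code>&lt;/pre>&lt;/div>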
&lt;p>Accuracy is one thing, but what about our returns? Let&amp;rsquo;s take a look at our cumulative returns over time. It&amp;rsquo;s difficult to graph the entire 2000 samples as it becomes one big blob on the graph, so we look at the first 40 samples which gives us a reasonable representation:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-15-1.png" width="672" />&lt;/p>
&lt;p>The result is a general trend downwards. We see some big jumps where our chosen horse is the long shot that came home, and some of our samples manage to pull themselves back into the black for periods of time. But they quickly regress back into the red.&lt;/p>
&lt;p>The number of races may vary slightly per sample, so instead of looking at the cumulative return, let&amp;rsquo;s look at the returns per race, i.e. (cumulative return / number of races).&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-16-1.png" width="672" />&lt;/p>
&lt;p>In the long run our average return per race is -$0.30, and 95% of our returns are within the range of -$0.49 to -$0.04. As we&amp;rsquo;ve used a dollar bet, this translates nicely to a percentage. What we can say is that in the long run we&amp;rsquo;re on average losing 30% of our stake each time we use this method of betting.&lt;/p>
&lt;h1 id="approach-2---favourite">Approach 2 - Favourite&lt;/h1>
&lt;p>The second approach to take is to bet on the favourite in each race. We rank each horse in each race by its starting price using the &lt;code>rank()&lt;/code> function, and extract the horse with a rank of 1. For races where there are two equal favourites, &lt;code>ties.method = &amp;quot;random&amp;quot;&lt;/code> picks one of those horses at random.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Favourite horse from each race&lt;/span>
hr_favourite &lt;span style="color:#f92672">&amp;lt;-&lt;/span> hr_mccv &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">drop_na&lt;/span>(odds.sp) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(sample_id, race_id) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(odds.rank &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">rank&lt;/span>(odds.sp, ties.method &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;random&amp;#34;&lt;/span>)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">slice_min&lt;/span>(odds.rank, with_ties &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#66d9ef">FALSE&lt;/span>, n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ungroup&lt;/span>()
&lt;span style="color:#75715e"># Place out bets&lt;/span>
hr_favourite &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">bet_returns&lt;/span>(hr_favourite)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Let&amp;rsquo;s again take a look at the accuracy of this approach, viewed as a histogram of accuracy per sample.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Calculate the accuracy&lt;/span>
hr_favourite_accuracy &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
hr_favourite &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(win &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">if_else&lt;/span>(position &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(sample_id) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(accuracy &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(win))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-20-1.png" width="672" />&lt;/p>
&lt;p>This is looking much better - we&amp;rsquo;ve got a mean accuracy across all of the samples of 35%, with 95% of the sample accuracies falling between 32.0% and 38.3%. These accuracy percentages look pretty good, and my gut feel is that they would be pretty difficult to approach with any sort of predictive model. Picking the favourite is around 3 times better than picking a random horse.&lt;/p>
&lt;p>What do our returns over time look like? Again we take the first 40 samples and graph the cumulative return.&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-22-1.png" width="672" />&lt;/p>
&lt;p>There is still a general trend downwards, however it&amp;rsquo;s certainly not as pronounced as the random method. There are longer periods of time where we&amp;rsquo;re trending sideways, and some of our samples even manage to eke out a profit.&lt;/p>
&lt;p>Taking a look again at the distributions of our returns per race:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/a-bit-on-the-nose/index_files/figure-html/unnamed-chunk-23-1.png" width="672" />&lt;/p>
&lt;p>Picking the favourite is much better than picking a random horse but it&amp;rsquo;s certainly no slam dunk. The long run average return per race is still negative at -$0.05, and 95% of returns per race are in the range of -$0.15 to $0.04.&lt;/p>
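&lt;p>As a rough cross-check on these figures (a back-of-the-envelope sketch, with symbols introduced here for illustration): if a sample of \(N\) one-dollar bets contains \(W\) winners whose starting prices average \(\bar{o}\), then the accuracy is \(a = W / N\) and the return per race is&lt;/p>
&lt;p>$$ \bar{r} = \frac{W \bar{o} - N}{N} = a \bar{o} - 1 \quad \Rightarrow \quad \bar{o} = \frac{1 + \bar{r}}{a} $$&lt;/p>
&lt;p>Plugging in the rounded averages gives \(\bar{o} \approx (1 - 0.30) / 0.11 \approx 6.4\) for the random picks and \(\bar{o} \approx (1 - 0.05) / 0.35 \approx 2.7\) for the favourites. The identity is exact within a single sample but only approximate when applied to averages across samples, so treat these numbers as indicative. They suggest that the favourites that win do so at much shorter prices, which is why a roughly threefold gain in accuracy doesn&amp;rsquo;t turn into a profit.&lt;/p>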
&lt;h1 id="conclusion">Conclusion&lt;/h1>
&lt;p>In this article we baselined two different approaches to betting on horse races: picking a random horse, and picking the favourite. Our aim was to determine the mean accuracy and mean returns per race for each of the approaches.&lt;/p>
&lt;p>We found the accuracy of picking a random horse is 11% and the mean returns per race for a dollar bet are -$0.30. You&amp;rsquo;re losing thirty cents on the dollar per bet.&lt;/p>
&lt;p>Betting on the favourite is unsurprisingly much better, with a mean accuracy of 35% and mean returns per race for a dollar bet being -$0.05, or a loss of five cents on the dollar. I&amp;rsquo;m impressed with the bookies&amp;rsquo; ability to get so close to parity.&lt;/p>
&lt;p>What we don&amp;rsquo;t take into account here is the utility, or enjoyment, that is gained from the bet. If you think the enjoyment you receive from betting on a random horse is worth around 30% of your stake, or from betting on the favourite around 5% of your stake, then go for it. As long as you&amp;rsquo;re not betting more than you can afford, then I say analyses be damned and simply enjoy the thrill of the punt.&lt;/p></description></item><item><title>AFL Bets - Analysing the Halftime Payout</title><link>https://clt.blog.foletta.net/post/afl-payouts-at-halftime/</link><pubDate>Thu, 13 Aug 2020 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/afl-payouts-at-halftime/</guid><description>&lt;p>If you&amp;rsquo;ve watched AFL over the past few years, you would have noticed betting companies spruiking a cornucopia of betting options. In fact it would be hard for you not to notice, given the way they yell at you down the television screen.&lt;/p>
&lt;p>One of the value adds these companies advertise is the &amp;lsquo;goal up at halftime&amp;rsquo; payout. The terms are that if you have a head-to-head bet on the game, and your team is up by 6 points or more at half time, you&amp;rsquo;ll be paid out as if you had won.&lt;/p>
&lt;p>Betting companies aren&amp;rsquo;t in the business of giving away money, so they must be confident that, in the long run, the team that is up at half time almost always goes on to win the game. But how confident should they be? Is this something that we can calculate?&lt;/p>
&lt;p>Good data analysis always starts off with a question. In this article I will try to answer the question:&lt;/p>
&lt;blockquote>
&lt;p>If a betting company pays out an AFL head-to-head bet at halftime because a team is up by 6 points or more, in the long run what proportion of bets will the betting company pay out that they wouldn&amp;rsquo;t have if they didn&amp;rsquo;t offer this option?&lt;/p>
&lt;/blockquote>
&lt;p>There are two scenarios to consider:&lt;/p>
&lt;ol>
&lt;li>A team is up at half time and goes on to win.&lt;/li>
&lt;li>A team is down at half time and goes on to win.&lt;/li>
&lt;/ol>
&lt;p>With 1, a betting company loses nothing by paying out at half time, as they would have had to pay out the bet anyway. However with 2, the betting company does lose, as they&amp;rsquo;re paying out on both teams. We want to determine how often this scenario plays out.&lt;/p>
&lt;h1 id="model-notes-assumptions">Model Notes Assumptions&lt;/h1>
&lt;p>The terms of the payout are clear-cut: head-to-head bet, team is up by a goal, bet is paid out. We want to model the probability of a win (the result) versus the half time differential (the predictor):&lt;/p>
&lt;p>$$ Pr(R = Win | S) $$&lt;/p>
&lt;p>Where \(R\) is the result (Win, Loss), and \(S\) is the score differential at halftime.&lt;/p>
&lt;p>The main assumption we need to make is around timing: we assume that this halftime payout is applied across all games. This allows us to discount all other variables such as the teams&amp;rsquo; odds, weather conditions, team form, etc. What is likely is that the betting companies have far more complex statistical models that take into account a wide range of variables. They can then &amp;lsquo;turn on&amp;rsquo; this offer on specific games when the probabilities are in their favour.&lt;/p>
&lt;p>Another item to consider is how far to go back in time to train our model. It could be that the game has changed significantly in the past few years, making this kind of payout more feasible. We will attempt to model this by adding a categorical variable representing the league a game was played in: the Victorian Football League (VFL) or the Australian Football League (AFL). We can then determine whether this has statistical significance, and whether it increases the accuracy of our model.&lt;/p>
&lt;h1 id="what-about-a-draw">What About a Draw?&lt;/h1>
&lt;p>Spoiler alert: we&amp;rsquo;re going to be using a logistic regression in our model. This allows us to model a binary outcome, but we actually have three outcomes: loss, win and draw.&lt;/p>
&lt;p>In a draw, the head-to-head bet is paid out at half the face value. So if a team is up at halftime and paid out, and the game goes on to be a draw, the betting company will still have paid out more than they had to. As we&amp;rsquo;re looking at &amp;lsquo;the &lt;strong>proportion&lt;/strong> of bets the betting company pays out that they wouldn&amp;rsquo;t have if they didn&amp;rsquo;t offer this option&amp;rsquo;, we will consider a draw to be a loss.&lt;/p>
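&lt;p>As a small illustration of that encoding, the sketch below uses three made-up scorelines (not real match data) and a strict greater-than comparison on the final scores. The names &lt;code>home_score&lt;/code>, &lt;code>away_score&lt;/code>, &lt;code>home_result&lt;/code> and &lt;code>away_result&lt;/code> are hypothetical and used only here. With a strict comparison, a drawn game ends up labelled as a loss for both teams:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Made-up final scores: a home win, a draw, and an away win
home_score &amp;lt;- c(100, 80, 90)
away_score &amp;lt;- c(90, 80, 95)

# A strict comparison means a draw is not a win for either side
home_result &amp;lt;- ifelse(home_score &amp;gt; away_score, &amp;#39;Win&amp;#39;, &amp;#39;Loss&amp;#39;)
away_result &amp;lt;- ifelse(away_score &amp;gt; home_score, &amp;#39;Win&amp;#39;, &amp;#39;Loss&amp;#39;)

print(home_result)  # "Win"  "Loss" "Loss"
print(away_result)  # "Loss" "Loss" "Win"
&lt;/code>&lt;/pre>&lt;/div>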
&lt;h1 id="loading-and-transforming-the-data">Loading and Transforming the Data&lt;/h1>
&lt;p>Our data for this analysis will come from the &lt;a href="https://afltables.com/afl/afl_index.html">AFL Tables&lt;/a> site, via the &lt;code>fitzRoy&lt;/code> R package. The data is received in a long format that includes statistics for each player. We&amp;rsquo;re only concerned with team statistics, not player statistics, so the first row is taken from each game and transmuted into the variables we require.&lt;/p>
&lt;p>As discussed in the assumptions, we&amp;rsquo;d like to see if there&amp;rsquo;s a difference between the VFL and AFL eras. A categorical variable is added denoting whether the game was played as part of the &amp;lsquo;Victorian Football League&amp;rsquo; or the &amp;lsquo;Australian Football League&amp;rsquo;, with the name change taking place in 1990.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">library&lt;/span>(tidyverse)
&lt;span style="color:#a6e22e">library&lt;/span>(magrittr)
&lt;span style="color:#a6e22e">library&lt;/span>(fitzRoy)
&lt;span style="color:#a6e22e">library&lt;/span>(modelr)
&lt;span style="color:#a6e22e">library&lt;/span>(lubridate)
&lt;span style="color:#75715e"># Download the AFL statistics&lt;/span>
afl_match_data &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">get_afltables_stats&lt;/span>()
&lt;span style="color:#75715e"># Group by each unique game and take the first row from each. &lt;/span>
&lt;span style="color:#75715e"># Transmute into the required data.&lt;/span>
afl_ht_results &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
afl_match_data &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
Game_ID &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">group_indices&lt;/span>(., Season, Round, Home.team, Away.team)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(Game_ID) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">slice&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">transmute&lt;/span>(
Home_HT.Diff &lt;span style="color:#f92672">=&lt;/span> (&lt;span style="color:#ae81ff">6&lt;/span> &lt;span style="color:#f92672">*&lt;/span> HQ2G &lt;span style="color:#f92672">+&lt;/span> HQ2B) &lt;span style="color:#f92672">-&lt;/span> (&lt;span style="color:#ae81ff">6&lt;/span> &lt;span style="color:#f92672">*&lt;/span> AQ2G &lt;span style="color:#f92672">+&lt;/span> AQ2B),
Away_HT.Diff &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#f92672">-&lt;/span>Home_HT.Diff,
Home_Result &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">as_factor&lt;/span>(&lt;span style="color:#a6e22e">ifelse&lt;/span>(Home.score &lt;span style="color:#f92672">&amp;gt;&lt;/span> Away.score, &lt;span style="color:#e6db74">&amp;#39;Win&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;Loss&amp;#39;&lt;/span>)),
Away_Result &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">as_factor&lt;/span>(&lt;span style="color:#a6e22e">ifelse&lt;/span>(Away.score &lt;span style="color:#f92672">&amp;gt;&lt;/span> Home.score, &lt;span style="color:#e6db74">&amp;#39;Win&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;Loss&amp;#39;&lt;/span>)),
League &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">ifelse&lt;/span>(&lt;span style="color:#a6e22e">year&lt;/span>(Date) &lt;span style="color:#f92672">&amp;gt;=&lt;/span> &lt;span style="color:#ae81ff">1990&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;AFL&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;VFL&amp;#39;&lt;/span>)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ungroup&lt;/span>()
&lt;span style="color:#a6e22e">print&lt;/span>(afl_ht_results)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 15,705 x 6
Game_ID Home_HT.Diff Away_HT.Diff Home_Result Away_Result League
&amp;lt;int&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;fct&amp;gt; &amp;lt;fct&amp;gt; &amp;lt;chr&amp;gt;
1 1 13 -13 Win Loss VFL
2 2 2 -2 Win Loss VFL
3 3 -18 18 Loss Win VFL
4 4 -17 17 Loss Win VFL
5 5 -14 14 Loss Win VFL
6 6 20 -20 Win Loss VFL
7 7 23 -23 Win Loss VFL
8 8 -41 41 Loss Win VFL
9 9 -24 24 Loss Win VFL
10 10 -15 15 Loss Win VFL
# … with 15,695 more rows
&lt;/code>&lt;/pre>&lt;p>We have one row per game, but our observations are focused on each team rather than each individual game. We pivot the data to give us the half time differential and result per team, which results in two rows per game. Note the ability of &lt;code>pivot_longer()&lt;/code> to produce more than one value column at once by combining the special &lt;code>.value&lt;/code> entry in &lt;code>names_to&lt;/code> with the &lt;code>names_sep&lt;/code> argument.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">afl_ht_results &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
afl_ht_results &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">pivot_longer&lt;/span>(
&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#a6e22e">c&lt;/span>(Game_ID, League),
names_to &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;Team&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;.value&amp;#39;&lt;/span>),
names_sep &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;_&amp;#39;&lt;/span>,
values_drop_na &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#66d9ef">TRUE&lt;/span>
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(&lt;span style="color:#f92672">-&lt;/span>Team)
&lt;span style="color:#a6e22e">print&lt;/span>(afl_ht_results)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 31,410 x 4
Game_ID League HT.Diff Result
&amp;lt;int&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;fct&amp;gt;
1 1 VFL 13 Win
2 1 VFL -13 Loss
3 2 VFL 2 Win
4 2 VFL -2 Loss
5 3 VFL -18 Loss
6 3 VFL 18 Win
7 4 VFL -17 Loss
8 4 VFL 17 Win
9 5 VFL -14 Loss
10 5 VFL 14 Win
# … with 31,400 more rows
&lt;/code>&lt;/pre>&lt;p>In this format there is a lot of redundancy: each game has two rows in our data frame, with each row simply being the negation of the other row. We remove this redundancy by taking, at random, one of the two rows from each game.&lt;/p>
&lt;p>At this point, we have tidied and transformed our data into the variables we require and into the shape we need. We can now start to analyse and use it for modeling.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">afl_ht_sample &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
afl_ht_results &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(Game_ID) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">sample_frac&lt;/span>(&lt;span style="color:#ae81ff">.5&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ungroup&lt;/span>()
&lt;span style="color:#a6e22e">print&lt;/span>(afl_ht_sample)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 15,705 x 4
Game_ID League HT.Diff Result
&amp;lt;int&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;fct&amp;gt;
1 1 VFL 13 Win
2 2 VFL 2 Win
3 3 VFL -18 Loss
4 4 VFL -17 Loss
5 5 VFL 14 Win
6 6 VFL 20 Win
7 7 VFL 23 Win
8 8 VFL -41 Loss
9 9 VFL 24 Win
10 10 VFL 15 Win
# … with 15,695 more rows
&lt;/code>&lt;/pre>&lt;p>Let&amp;rsquo;s take a look at the win/loss ratios for each halftime differential, split by the league in which the game was played.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Graph the win/loss ratios by the halftime differential&lt;/span>
afl_ht_sample &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(HT.Diff, League) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(Ratio &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(Result &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Win&amp;#39;&lt;/span>), .groups &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;keep&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_point&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(HT.Diff, Ratio, colour &lt;span style="color:#f92672">=&lt;/span> League)) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Half Time Difference&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Win/Loss Ratio&amp;#39;&lt;/span>,
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;AFL Games - All Games&amp;#39;&lt;/span>,
subtitle &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Half Time Difference vs. Win/Loss Ratio&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2020-06-15-afl-payouts-at-halftime_files/figure-html/unnamed-chunk-5-1.png" width="672" />&lt;/p>
&lt;p>As mentioned earlier, we knew that logistic regression would be the likely method used to model this data. This graph, with its clear sigmoid-shaped curve, confirms that a logistic regression is an appropriate choice.&lt;/p>
&lt;h1 id="modeling">Modeling&lt;/h1>
&lt;p>Our data is in the right shape, so we now use it to create a model. The data is split into training and test sets, with 80% of the observations in the training set and 20% left over for final testing.&lt;/p>
&lt;p>A logistic regression is then used to model the result of the game against the halftime differential and the league, and also takes into account any interaction between the league and the halftime differential. As a learning exercise I&amp;rsquo;ve decided to use the &lt;a href="https://www.tidymodels.org/">tidymodels&lt;/a> approach to run the regression.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">library&lt;/span>(tidymodels)
&lt;span style="color:#75715e"># Split the data into training and test sets.&lt;/span>
&lt;span style="color:#a6e22e">set.seed&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>)
afl_ht_sets &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
afl_ht_sample &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">initial_split&lt;/span>(prop &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.8&lt;/span>)
&lt;span style="color:#75715e"># Define our model and engine.&lt;/span>
afl_ht_model &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">logistic_reg&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">set_engine&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;glm&amp;#39;&lt;/span>)
&lt;span style="color:#75715e"># Fit our model on the training set&lt;/span>
afl_ht_fit &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
afl_ht_model &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">fit&lt;/span>(Result &lt;span style="color:#f92672">~&lt;/span> HT.Diff &lt;span style="color:#f92672">*&lt;/span> League, data &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">training&lt;/span>(afl_ht_sets))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Let&amp;rsquo;s see how this model looks against the training data:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># View the logistic regression against the win/loss ratios&lt;/span>
&lt;span style="color:#a6e22e">training&lt;/span>(afl_ht_sets) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(HT.Diff, League) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(Ratio &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(Result &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Win&amp;#39;&lt;/span>), .groups &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;keep&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">bind_cols&lt;/span>(&lt;span style="color:#a6e22e">predict&lt;/span>(afl_ht_fit, new_data &lt;span style="color:#f92672">=&lt;/span> ., type &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;prob&amp;#39;&lt;/span>)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_point&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(HT.Diff, Ratio, colour &lt;span style="color:#f92672">=&lt;/span> League), alpha &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.3&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_line&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(HT.Diff, .pred_Win, colour &lt;span style="color:#f92672">=&lt;/span> League))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2020-06-15-afl-payouts-at-halftime_files/figure-html/unnamed-chunk-7-1.png" width="672" />&lt;/p>
&lt;p>We see it fits the data well, and that there doesn&amp;rsquo;t appear visually to be much of a difference between the VFL era and the AFL era.&lt;/p>
&lt;h2 id="probabilities">Probabilities&lt;/h2>
&lt;p>We&amp;rsquo;re concerned with half time differentials of 6 points or more, so let&amp;rsquo;s look at some of the probabilities our model spits out for one, two and three goal leads at half time.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># 1, 2 and 3 goal leads in the two leagues&lt;/span>
prob_data &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">crossing&lt;/span>(
HT.Diff &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#ae81ff">2&lt;/span>, &lt;span style="color:#ae81ff">3&lt;/span>) &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#ae81ff">6&lt;/span>,
League &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;AFL&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;VFL&amp;#39;&lt;/span>)
)
&lt;span style="color:#75715e"># Prediction across this data&lt;/span>
afl_ht_fit &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">predict&lt;/span>(new_data &lt;span style="color:#f92672">=&lt;/span> prob_data, type &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;prob&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">bind_cols&lt;/span>(prob_data) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(Percent_Win &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">round&lt;/span>(.pred_Win &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#ae81ff">100&lt;/span>, &lt;span style="color:#ae81ff">2&lt;/span>)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select_at&lt;/span>(&lt;span style="color:#a6e22e">vars&lt;/span>(&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#a6e22e">starts_with&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;.pred&amp;#39;&lt;/span>)))
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 6 x 3
HT.Diff League Percent_Win
&amp;lt;dbl&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt;
1 6 AFL 62.1
2 6 VFL 63.0
3 12 AFL 73.1
4 12 VFL 74.9
5 18 AFL 81.9
6 18 VFL 83.9
&lt;/code>&lt;/pre>&lt;p>Our model gives win probabilities of roughly 62-63%, 73-75% and 82-84% across the two leagues for teams leading by one, two and three goals respectively. For the purposes of this article we&amp;rsquo;re going to use a decision threshold of 50% between the &amp;lsquo;Win&amp;rsquo; and &amp;lsquo;Loss&amp;rsquo; categories.&lt;/p>
&lt;p>What this means is that our model will always predict a win if a team is leading by a goal or more, and thus for us there will only be two outcome types:&lt;/p>
&lt;ul>
&lt;li>True Positives - predicting a win when the result is a win.&lt;/li>
&lt;li>False Positives - predicting a win when the result is a loss.&lt;/li>
&lt;/ul>
&lt;p>At this point we could move to estimate the expected proportion of payouts simply by looking at the win/loss ratios for halftime differentials of 6 points or more. But before we do that, let&amp;rsquo;s perform some further diagnostics on our model to ensure that we&amp;rsquo;re not making invalid assumptions.&lt;/p>
&lt;h2 id="training-accuracy">Training Accuracy&lt;/h2>
&lt;p>The next step is to look at the accuracy of our model against the training set, that is, the same set of data that the model was built upon.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Calculate training accuracy&lt;/span>
afl_ht_fit &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">predict&lt;/span>(&lt;span style="color:#a6e22e">training&lt;/span>(afl_ht_sets)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">bind_cols&lt;/span>(&lt;span style="color:#a6e22e">training&lt;/span>(afl_ht_sets)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">accuracy&lt;/span>(Result, .pred_class)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 1 x 3
.metric .estimator .estimate
&amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt;
1 accuracy binary 0.784
&lt;/code>&lt;/pre>&lt;p>So 78.4% of the time the model predicts the correct result. That&amp;rsquo;s good, but we need to remember that the model was generated from the same data so it&amp;rsquo;s going to be optimistic. The test accuracy is likely to be lower.&lt;/p>
&lt;h2 id="model-coefficients">Model Coefficients&lt;/h2>
&lt;p>We&amp;rsquo;ve looked at the outputs of the model: probabilities and accuracy. But what is the model actually telling us about how the halftime differential and the league relate to the probability of a win? This is where the actual values of the model coefficients come in. Our logistic function looks like this:&lt;/p>
&lt;p>$$ p(X) = \frac{
e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2}
}{
1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2}
}$$&lt;/p>
&lt;p>where \(X_1\) is the halftime differential in points, and \(X_2\) is a categorical variable denoting the league the game was played in: &amp;lsquo;AFL&amp;rsquo; or &amp;lsquo;VFL&amp;rsquo;.&lt;/p>
&lt;p>We also need to remember that the linear combination of the coefficients is not a probability, but the log-odds or logit of the result:&lt;/p>
&lt;p>$$ logit(p) = log(\frac{p}{1 - p}) $$
where \(p\) is the probability of the result being a win.&lt;/p>
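&lt;p>For readers who want to check the relationship between the two scales numerically, base R provides &lt;code>qlogis()&lt;/code> (the logit) and &lt;code>plogis()&lt;/code> (its inverse, the logistic function). The probabilities below are arbitrary illustrative values, not output from the model:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Arbitrary probabilities, chosen only for illustration
p &amp;lt;- c(0.25, 0.5, 0.75)

# qlogis() is the logit: log(p / (1 - p))
log_odds &amp;lt;- qlogis(p)
all.equal(log_odds, log(p / (1 - p)))

# plogis() is the inverse: exp(x) / (1 + exp(x))
all.equal(plogis(log_odds), p)
all.equal(plogis(log_odds), exp(log_odds) / (1 + exp(log_odds)))
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>A probability of 0.5 corresponds to log-odds of zero, which is why a 50% decision threshold sits exactly where the linear combination changes sign.&lt;/p>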
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Model coefficients&lt;/span>
afl_ht_fit &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">tidy&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 4 x 5
term estimate std.error statistic p.value
&amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
1 (Intercept) 0.0139 0.0392 0.355 7.23e- 1
2 HT.Diff -0.0846 0.00252 -33.6 8.02e-248
3 LeagueVFL 0.0126 0.0487 0.259 7.96e- 1
4 HT.Diff:LeagueVFL -0.00869 0.00332 -2.62 8.76e- 3
&lt;/code>&lt;/pre>&lt;p>Before interpreting these, note the negative sign on &lt;code>HT.Diff&lt;/code>: as fitted, the event being modelled is a loss rather than a win, so the estimates describe the log-odds of a loss. This is consistent with the predicted win probabilities above, and negating a coefficient gives the corresponding change in the log-odds of a win. The model coefficients are as follows:&lt;/p>
&lt;ul>
&lt;li>The intercept (\(\beta_0\)) tells us the log-odds of losing in the AFL with a halftime differential of zero.&lt;/li>
&lt;li>&lt;code>HT.Diff&lt;/code> (\(\beta_1\)) is the change in log-odds of a loss in the AFL for every one point of halftime differential.&lt;/li>
&lt;li>&lt;code>LeagueVFL&lt;/code> (\(\beta_2\)) tells us the &lt;em>difference&lt;/em> in log-odds of losing with a differential of zero in the VFL as compared to the AFL.
&lt;ul>
&lt;li>This needs to be added to the intercept when considering VFL games.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;code>HT.Diff:LeagueVFL&lt;/code> (\(\beta_3\)) is the difference in the change in log-odds of a loss for every one point of halftime differential in the VFL.
&lt;ul>
&lt;li>Again, this needs to be added to the &lt;code>HT.Diff&lt;/code> coefficient when considering VFL games.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>So for an AFL game, each point a team is leading by at half time multiplies their odds of losing by \(e^{-0.0846468}\) or 0.9188368; equivalently, their odds of winning are multiplied by \(e^{0.0846468}\), or about 1.088. For a VFL game, the odds of losing are multiplied by \(e^{-0.0846468 + (-0.0086936)}\) or 0.9108834, so the odds of winning are multiplied by about 1.098.&lt;/p>
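&lt;p>To sanity-check how the coefficients combine, here is a minimal sketch that rebuilds a predicted win probability by hand. It assumes the rounded estimates printed in the coefficient table, and it assumes that the fit treats &amp;lsquo;Loss&amp;rsquo; as the modelled event, which is what the negative sign on &lt;code>HT.Diff&lt;/code> suggests. The function name &lt;code>win_probability()&lt;/code> and its &lt;code>vfl&lt;/code> argument are hypothetical and not part of the original analysis:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Rounded estimates copied from the coefficient table
b0 &amp;lt;- 0.0139     # (Intercept)
b1 &amp;lt;- -0.0846    # HT.Diff
b2 &amp;lt;- 0.0126     # LeagueVFL
b3 &amp;lt;- -0.00869   # HT.Diff:LeagueVFL

# Assumption: the linear combination is the log-odds of a loss.
# vfl is 1 for a VFL-era game and 0 for an AFL-era game.
win_probability &amp;lt;- function(ht_diff, vfl) {
  log_odds_loss &amp;lt;- b0 + b1 * ht_diff + b2 * vfl + b3 * ht_diff * vfl
  1 - plogis(log_odds_loss)
}

# One-goal lead at half time, as a percentage
round(100 * win_probability(6, vfl = 0), 1)  # about 62.1 (AFL)
round(100 * win_probability(6, vfl = 1), 1)  # about 63.0 (VFL)
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>These values agree with the &lt;code>Percent_Win&lt;/code> figures for a 6 point lead in the earlier probability table, which supports this reading of the coefficient signs.&lt;/p>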
&lt;p>Looking at the p-values, if we assume a standard significance level of \(\alpha = 0.05\), then the halftime difference is highly significant, and the interaction term is also significant (\(p \approx 0.009\)), although its effect is small: the change in log-odds per point of halftime differential differs only slightly between the VFL and the AFL. The intercept and the &lt;code>LeagueVFL&lt;/code> term are not significant. Consistent with the predicted probabilities shown earlier, the odds of winning given a lead at halftime were slightly higher in the VFL era than in the AFL era.&lt;/p>
&lt;p>We&amp;rsquo;re building this model in order to &lt;em>predict&lt;/em> the results, not in order to &lt;em>explain&lt;/em> how each variable affects the outcome. As such, the statistical significance of each predictor isn&amp;rsquo;t that relevant to us. But it does raise a question: should we include the statistically insignificant predictors in our model or not?&lt;/p>
&lt;p>To answer this question, we&amp;rsquo;ll use bootstrapping.&lt;/p>
&lt;h1 id="bootstrapping">Bootstrapping&lt;/h1>
&lt;p>With the bootstrap, we repeatedly draw a sample from the training set &lt;em>with replacement&lt;/em>, fit our model to each sample, and record its accuracy. The mean accuracy is then calculated across all of these runs.&lt;/p>
&lt;p>We&amp;rsquo;ll create two models: one with the league included, and a model without. The model with the best accuracy from this bootstrap is the model that will be used.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Recipe A: Result vs Halftime Difference&lt;/span>
afl_recipe_ht &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">training&lt;/span>(afl_ht_sets) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">recipe&lt;/span>(Result &lt;span style="color:#f92672">~&lt;/span> HT.Diff) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">step_dummy&lt;/span>(&lt;span style="color:#a6e22e">all_nominal&lt;/span>(), &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#a6e22e">all_outcomes&lt;/span>())
&lt;span style="color:#75715e"># Bootstrap Recipe A&lt;/span>
&lt;span style="color:#a6e22e">workflow&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">add_model&lt;/span>(afl_ht_model) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">add_recipe&lt;/span>(afl_recipe_ht) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">fit_resamples&lt;/span>(&lt;span style="color:#a6e22e">bootstraps&lt;/span>(&lt;span style="color:#a6e22e">training&lt;/span>(afl_ht_sets), times &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">50&lt;/span>)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">collect_metrics&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 2 x 5
.metric .estimator mean n std_err
&amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;int&amp;gt; &amp;lt;dbl&amp;gt;
1 accuracy binary 0.784 50 0.000765
2 roc_auc binary 0.869 50 0.000646
&lt;/code>&lt;/pre>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Recipe B: Result vs Halftime Difference, League, and interaction term &lt;/span>
afl_recipe_ht_league &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">training&lt;/span>(afl_ht_sets) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">recipe&lt;/span>(Result &lt;span style="color:#f92672">~&lt;/span> HT.Diff &lt;span style="color:#f92672">+&lt;/span> League) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">step_dummy&lt;/span>(&lt;span style="color:#a6e22e">all_nominal&lt;/span>(), &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#a6e22e">all_outcomes&lt;/span>()) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">step_interact&lt;/span>(&lt;span style="color:#f92672">~&lt;/span>League_VFL&lt;span style="color:#f92672">:&lt;/span>HT.Diff)
&lt;span style="color:#75715e"># Bootstrap recipe B&lt;/span>
&lt;span style="color:#a6e22e">workflow&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">add_model&lt;/span>(afl_ht_model) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">add_recipe&lt;/span>(afl_recipe_ht_league) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">fit_resamples&lt;/span>(&lt;span style="color:#a6e22e">bootstraps&lt;/span>(&lt;span style="color:#a6e22e">training&lt;/span>(afl_ht_sets), times &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">50&lt;/span>)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">collect_metrics&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 2 x 5
.metric .estimator mean n std_err
&amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;int&amp;gt; &amp;lt;dbl&amp;gt;
1 accuracy binary 0.784 50 0.000618
2 roc_auc binary 0.869 50 0.000525
&lt;/code>&lt;/pre>&lt;p>The result: there is hardly any difference between a model with only halftime difference, and a model that takes into account the league the game was played in.&lt;/p>
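&lt;p>For anyone wondering what the resampling is doing under the hood, the idea can be sketched in base R. This is only an illustration under stated assumptions: the data is simulated (it is not the AFL data), the names &lt;code>toy&lt;/code>, &lt;code>boot_accuracy()&lt;/code> and &lt;code>acc&lt;/code> are hypothetical, and accuracy is measured on the rows that were not drawn into each bootstrap sample (the &amp;lsquo;out-of-bag&amp;rsquo; rows), which is my understanding of how the bootstrap resamples above are assessed:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">set.seed(42)

# Simulated stand-in data: the chance of a win rises with the halftime lead
n &amp;lt;- 2000
toy &amp;lt;- data.frame(ht_diff = round(rnorm(n, mean = 0, sd = 20)))
toy$win &amp;lt;- rbinom(n, size = 1, prob = plogis(0.085 * toy$ht_diff))

# One bootstrap run: resample, fit, score on the rows left out
boot_accuracy &amp;lt;- function(data) {
  idx &amp;lt;- sample(nrow(data), replace = TRUE)   # rows drawn with replacement
  fit &amp;lt;- glm(win ~ ht_diff, family = binomial, data = data[idx, ])
  oob &amp;lt;- data[-unique(idx), ]                 # rows never drawn (out-of-bag)
  predicted_win &amp;lt;- predict(fit, newdata = oob, type = &amp;#39;response&amp;#39;) &amp;gt; 0.5
  mean(predicted_win == (oob$win == 1))
}

# Repeat 50 times and average, mirroring times = 50 above
acc &amp;lt;- replicate(50, boot_accuracy(toy))
mean(acc)
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Comparing two candidate models then comes down to running the same procedure with two different formulas and comparing the mean accuracies.&lt;/p>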
&lt;h1 id="final-testing">Final Testing&lt;/h1>
&lt;p>A model needs to be chosen, and so the model with the league predictor is the one that will be used. How does the model stack up against the test set?&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">afl_ht_fit &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">predict&lt;/span>(&lt;span style="color:#a6e22e">testing&lt;/span>(afl_ht_sets)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">bind_cols&lt;/span>(&lt;span style="color:#a6e22e">testing&lt;/span>(afl_ht_sets)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">accuracy&lt;/span>(Result, .pred_class)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 1 x 3
.metric .estimator .estimate
&amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt;
1 accuracy binary 0.787
&lt;/code>&lt;/pre>&lt;p>Our test accuracy is in fact slightly better than our training accuracy!&lt;/p>
&lt;p>This accuracy is across the whole gamut of halftime differentials, but we&amp;rsquo;re only concerned with halftime differentials of 6 points or more:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Keep only differentials of 6 points or more&lt;/span>
afl_ht_testing_subset &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">testing&lt;/span>(afl_ht_sets) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">filter&lt;/span>(HT.Diff &lt;span style="color:#f92672">&amp;gt;=&lt;/span> &lt;span style="color:#ae81ff">6&lt;/span>)
&lt;span style="color:#75715e"># Apply the model to this subset of data &lt;/span>
afl_ht_testing_subset_fit &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
afl_ht_fit &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">predict&lt;/span>(afl_ht_testing_subset) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">bind_cols&lt;/span>(afl_ht_testing_subset)
&lt;span style="color:#75715e"># Test set, goal or more accuracy&lt;/span>
afl_ht_testing_subset_fit &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">accuracy&lt;/span>(Result, .pred_class)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 1 x 3
.metric .estimator .estimate
&amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt;
1 accuracy binary 0.835
&lt;/code>&lt;/pre>&lt;p>Our model, applied to the subset of the test data we&amp;rsquo;re concerned with, is 84% accurate.&lt;/p>
&lt;h1 id="conclusion">Conclusion&lt;/h1>
&lt;p>How do these values we&amp;rsquo;ve calculated relate to the payouts a betting company has to deliver? If a head-to-head bet has been placed on a team and that team is up by 6 points or more at half time, we estimate that 84% of the time they will go on to win. The betting company would have had to pay this out anyway, so this scenario does not have an effect on the payout.&lt;/p>
&lt;p>However, with the &amp;lsquo;payout at halftime&amp;rsquo; deal in place, there are times when a team is down by 6 points or more at half-time and goes on to win. From our model and data, we see this occurring 16% of the time.&lt;/p>
&lt;p>Therefore, with this deal in place, on head-to-head bets, we would expect the betting companies to pay out 16/84 or 19.05% more times than they would if the deal was not in place. We note that this includes cases where the result is a draw, and the company would only pay out half of the head-to-head bet.&lt;/p></description></item><item><title>Simulating Snakes and Ladders</title><link>https://clt.blog.foletta.net/post/snakes-and-ladders/</link><pubDate>Wed, 03 Jun 2020 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/snakes-and-ladders/</guid><description>&lt;p>For the past couple of months my family and I - like the rest of the world - have been in isolation due to the coronavirus. My eldest son Ned is 5 years old and is interested in games and puzzles at the moment, so these have been a key tool in reducing the boredom of lockdown.&lt;/p>
&lt;p>Snakes and ladders is one of the games that&amp;rsquo;s caught his attention. While sitting on the floor and playing a game for the umpteenth time, I started to wonder about some of the game&amp;rsquo;s statistical properties. That&amp;rsquo;s normal, right?&lt;/p>
&lt;p>In this article I want to try and answer two questions about snakes and ladders. The first is:&lt;/p>
&lt;blockquote>
&lt;p>For my son&amp;rsquo;s board, what is the average number of dice rolls it takes to finish a game?&lt;/p>
&lt;/blockquote>
&lt;p>And the second is:&lt;/p>
&lt;blockquote>
&lt;p>What is the average number of dice rolls it takes to finish a game for a generalised board?&lt;/p>
&lt;/blockquote>
&lt;h1 id="defining-the-board">Defining the Board&lt;/h1>
&lt;p>This is the board we play on - it&amp;rsquo;s a large sheet of plastic, hence the crinkles:&lt;/p>
&lt;p>&lt;img src="https://clt.blog.foletta.net/post/snakes_and_ladders/board.jpg" alt="Our Snakes and Ladders Board">&lt;/p>
&lt;p>A snakes and ladders board can be represented as a vector, with each element of the vector representing a square or &amp;lsquo;spot&amp;rsquo; on the board. Each element holds the value of the shift that occurs when you land on it: negative for snakes, positive for ladders, or zero for neither.&lt;/p>
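&lt;p>As a minimal sketch of this representation, here is a hypothetical five-square board (not the real one) with a single ladder and a single snake. The names &lt;code>toy_board&lt;/code> and &lt;code>land_on()&lt;/code> are made up purely for illustration:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Hypothetical 5-square board: a ladder from 2 to 5, a snake from 4 to 1
toy_board = c(0, 5 - 2, 0, 1 - 4, 0)

# Where do we end up after landing on a given square?
land_on = function(board, square) square + board[square]

land_on(toy_board, 2)  # ladder: 2 + 3 = 5
land_on(toy_board, 4)  # snake: 4 - 3 = 1
land_on(toy_board, 3)  # plain square: stays on 3
&lt;/code>&lt;/pre>&lt;/div>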
&lt;p>The vector below is a representation of my son&amp;rsquo;s board. We&amp;rsquo;re letting R do the calculations for us here, entering each value as &lt;em>destination - source&lt;/em>, which comes out positive for ladders and negative for snakes.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">neds_board &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(
&lt;span style="color:#ae81ff">38-1&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">14-4&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">31-9&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>,
&lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">6-16&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>,
&lt;span style="color:#ae81ff">42-21&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">84-28&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>,
&lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">44-36&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>,
&lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">26-47&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">11-49&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>,
&lt;span style="color:#ae81ff">67-51&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">53-56&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>,
&lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">19-62&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">60-64&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>,
&lt;span style="color:#ae81ff">91-71&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">100-80&lt;/span>,
&lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">24-87&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>,
&lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">73-93&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">75-95&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">78-98&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#ae81ff">0&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;h1 id="playing-the-game">Playing the Game&lt;/h1>
&lt;p>With a data structure that represents the board, we now need an algorithm that represents the game.&lt;/p>
&lt;p>The &lt;code>snl_game()&lt;/code> function takes a vector defining a board, and a finish type, and runs through a single player game until the game is complete, returning the number of rolls it took to finish the game.&lt;/p>
&lt;p>The finish type specifies one of the two different ways a game can be finished. My son and I play an &amp;lsquo;over&amp;rsquo; finish type, where any dice roll that takes you past the last spot on the board results in a win.&lt;/p>
&lt;p>The other finish is the &amp;lsquo;exact&amp;rsquo; type, where you need to land exactly one spot past the last spot on the board to win. If you roll a value that takes you over, you remain in your current place.&lt;/p>
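&lt;p>To make the difference between the two finish types concrete, here is a small sketch that resolves a single roll near the end of a board. The helper &lt;code>finish_move()&lt;/code> is hypothetical and ignores snakes and ladders; it only assumes, as described above, that the winning position is one past the last spot:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Hypothetical helper: resolve one roll near the end of the board.
# fin_pos is the winning position, one past the last spot.
finish_move = function(pos, roll, fin_pos, finish) {
  next_pos = pos + roll
  if (next_pos == fin_pos) return("win")
  if (next_pos > fin_pos) {
    # Overshoot: an 'over' game is won, an 'exact' game stays put
    if (finish == "over") return("win") else return(paste("stay on", pos))
  }
  paste("move to", next_pos)
}

# On a 100-spot board the winning position is 101
finish_move(99, 5, 101, "over")   # "win"
finish_move(99, 5, 101, "exact")  # "stay on 99"
finish_move(99, 2, 101, "exact")  # "win"
&lt;/code>&lt;/pre>&lt;/div>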
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">library&lt;/span>(tidyverse)
&lt;span style="color:#a6e22e">library&lt;/span>(magrittr)
&lt;span style="color:#a6e22e">library&lt;/span>(glue)
&lt;span style="color:#a6e22e">library&lt;/span>(knitr)
&lt;span style="color:#a6e22e">library&lt;/span>(kableExtra)
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">snl_game &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(board, finish &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;exact&amp;#39;&lt;/span>) {
&lt;span style="color:#a6e22e">if &lt;/span>(&lt;span style="color:#f92672">!&lt;/span>finish &lt;span style="color:#f92672">%in%&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;exact&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;over&amp;#39;&lt;/span>)) {
&lt;span style="color:#a6e22e">stop&lt;/span>(&lt;span style="color:#e6db74">&amp;#34;Argument &amp;#39;finish&amp;#39; must be either &amp;#39;exact&amp;#39; or &amp;#39;over&amp;#39;&amp;#34;&lt;/span>)
}
&lt;span style="color:#75715e"># We start on 0, which is off the board. First space is 1&lt;/span>
pos &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>
&lt;span style="color:#75715e"># We finish one past the end of the board&lt;/span>
fin_pos &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">length&lt;/span>(board) &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>
&lt;span style="color:#75715e"># roll counter&lt;/span>
rolls &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>
&lt;span style="color:#a6e22e">while &lt;/span>(rolls &lt;span style="color:#f92672">&amp;lt;-&lt;/span> rolls &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) {
&lt;span style="color:#75715e"># Roll the dice&lt;/span>
roll &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">sample&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">6&lt;/span>, &lt;span style="color:#ae81ff">1&lt;/span>)
&lt;span style="color:#75715e"># Update the position&lt;/span>
next_pos &lt;span style="color:#f92672">&amp;lt;-&lt;/span> pos &lt;span style="color:#f92672">+&lt;/span> roll
&lt;span style="color:#75715e"># Two types of finish:&lt;/span>
&lt;span style="color:#75715e"># a) We need an exact roll to win&lt;/span>
&lt;span style="color:#75715e"># b) Any roll that takes us past the end wins&lt;/span>
&lt;span style="color:#a6e22e">if &lt;/span>(next_pos &lt;span style="color:#f92672">&amp;gt;&lt;/span> fin_pos) {
&lt;span style="color:#a6e22e">if &lt;/span>(finish &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;exact&amp;#39;&lt;/span>) { next }
else { &lt;span style="color:#a6e22e">return&lt;/span>(rolls) }
}
&lt;span style="color:#75715e"># Did we win?&lt;/span>
&lt;span style="color:#a6e22e">if &lt;/span>(next_pos &lt;span style="color:#f92672">==&lt;/span> fin_pos) { &lt;span style="color:#a6e22e">return&lt;/span>(rolls) }
&lt;span style="color:#75715e"># Take into account any snakes/ladders &lt;/span>
pos &lt;span style="color:#f92672">&amp;lt;-&lt;/span> next_pos &lt;span style="color:#f92672">+&lt;/span> board[next_pos]
&lt;span style="color:#75715e"># Did we somehow move off the board in the negative direction?&lt;/span>
&lt;span style="color:#a6e22e">if &lt;/span>(pos &lt;span style="color:#f92672">&amp;lt;&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) {
&lt;span style="color:#a6e22e">warning&lt;/span>(&lt;span style="color:#a6e22e">glue&lt;/span>(&lt;span style="color:#e6db74">&amp;#34;Went into negative board position: {pos}&amp;#34;&lt;/span>))
&lt;span style="color:#a6e22e">return&lt;/span>(&lt;span style="color:#66d9ef">NA_integer_&lt;/span>)
}
}
}
&lt;/code>&lt;/pre>&lt;/div>&lt;h1 id="answering-the-specific-question">Answering the Specific Question&lt;/h1>
&lt;p>Now that we&amp;rsquo;ve defined a data structure and an algorithm, let&amp;rsquo;s try and determine the average number of rolls to win on my son&amp;rsquo;s board. Using my new favourite function &lt;code>crossing()&lt;/code>, 200,000 games are simulated for each of the finish types and summary statistics calculated. We visualise the distribution of the number of rolls as a histogram:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Simulate 200,000 games of each finish type &lt;/span>
neds_board_sim &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">crossing&lt;/span>(
finish_type &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;exact&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;over&amp;#39;&lt;/span>),
n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">200000&lt;/span>
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(rolls &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map_dbl&lt;/span>(finish_type, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">snl_game&lt;/span>(neds_board, finish &lt;span style="color:#f92672">=&lt;/span> .x)))
&lt;span style="color:#75715e"># Summarise the results&lt;/span>
neds_board_summary &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
neds_board_sim &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(finish_type) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(
min &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">min&lt;/span>(rolls),
max &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">max&lt;/span>(rolls),
mean &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(rolls),
quantile_95 &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">quantile&lt;/span>(rolls, &lt;span style="color:#ae81ff">.95&lt;/span>),
quantile_5 &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">quantile&lt;/span>(rolls, &lt;span style="color:#ae81ff">.05&lt;/span>)
)
&lt;span style="color:#75715e"># Plot the histograms&lt;/span>
neds_board_sim &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_histogram&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(rolls), binwidth &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_vline&lt;/span>(
&lt;span style="color:#a6e22e">aes&lt;/span>(xintercept &lt;span style="color:#f92672">=&lt;/span> mean),
linetype &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;dashed&amp;#39;&lt;/span>,
colour &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;red&amp;#39;&lt;/span>,
neds_board_summary
) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_label&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(label &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">round&lt;/span>(mean, &lt;span style="color:#ae81ff">1&lt;/span>), x &lt;span style="color:#f92672">=&lt;/span> mean, y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>), neds_board_summary) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">facet_wrap&lt;/span>(&lt;span style="color:#f92672">~&lt;/span>finish_type, scales &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;free&amp;#39;&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Number of Dice Rolls&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Number of Games&amp;#39;&lt;/span>,
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Snakes and Ladders - Dice Roll Histogram&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2020-05-09-snakes-and-ladders_files/figure-html/my_board_simulation-1.png" width="672" />&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">print&lt;/span>(neds_board_summary)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 2 x 6
finish_type min max mean quantile_95 quantile_5
&amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
1 exact 7 288 41.7 90 15
2 over 7 293 36.5 83 12
&lt;/code>&lt;/pre>&lt;p>From this simulated data we&amp;rsquo;ve determined that it takes on average 41.72 rolls to finish an &amp;lsquo;exact&amp;rsquo; game type, and 36.51 rolls to finish an &amp;lsquo;over&amp;rsquo; game type.&lt;/p>
&lt;p>For the &amp;lsquo;over&amp;rsquo; finish type that my son and I play, I estimate a dice roll and move to take around 10 seconds. Our games should on average take around 13 minutes, with 95% of games finishing in less than 28 minutes.&lt;/p>
&lt;h1 id="answering-the-general-question">Answering the General Question&lt;/h1>
&lt;p>We&amp;rsquo;ve answered the specific question, but can we generalise this to any board? To do this, we&amp;rsquo;ll have to provide a way of generating a board.&lt;/p>
&lt;p>There are two random elements that we need to generate: which spots on the board will have a snake or a ladder, and the shift value for each of these spots.&lt;/p>
&lt;p>The first step is to define the shift - either forwards or backwards - of a single spot. This is done with the &lt;code>spot_alloc()&lt;/code> function below. The shift is taken from a normal distribution (floored to an integer) and &lt;code>min()&lt;/code>/&lt;code>max()&lt;/code> clamped so that we don&amp;rsquo;t shift ourselves off the bottom or the top of the board.&lt;/p>
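&lt;p>The clamping step is easiest to see with fixed numbers rather than random draws. The sketch below uses a hypothetical helper, &lt;code>clamp_shift()&lt;/code>, that applies the same &lt;code>min()&lt;/code>/&lt;code>max()&lt;/code> idea to a shift we supply by hand:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Hypothetical helper: clamp a proposed shift so that spot + shift
# stays between spot 1 and spot board_size
clamp_shift = function(spot, board_size, shift) {
  max(-(spot - 1), min(board_size - spot, shift))
}

clamp_shift(5, 100, -30)  # -4: a snake from spot 5 can go no lower than spot 1
clamp_shift(5, 100, 200)  # 95: a ladder from spot 5 can go no higher than spot 100
clamp_shift(5, 100, 12)   # 12: already within the board, left unchanged
&lt;/code>&lt;/pre>&lt;/div>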
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">spot_alloc &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(spot, board_size, mean, sd) {
&lt;span style="color:#75715e"># Integer portion of a random normal variable&lt;/span>
r &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">floor&lt;/span>(&lt;span style="color:#a6e22e">rnorm&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>, mean, sd))
&lt;span style="color:#75715e"># Clamp the shift value to within the board limits&lt;/span>
&lt;span style="color:#a6e22e">max&lt;/span>(&lt;span style="color:#f92672">-&lt;/span>(spot &lt;span style="color:#ae81ff">-1&lt;/span>), &lt;span style="color:#a6e22e">min&lt;/span>(board_size &lt;span style="color:#f92672">-&lt;/span> spot, r))
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The second step is to generate the board. The &lt;code>snl_board()&lt;/code> function does this, taking a board size, a proportion of the board that will be snakes and ladders, and a desired mean and standard deviation for the snake and ladder shifts.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">snl_board &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(board_size, proportion, mean, sd) {
&lt;span style="color:#75715e"># Allocate the board&lt;/span>
board &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">rep&lt;/span>(&lt;span style="color:#ae81ff">0&lt;/span>, board_size)
&lt;span style="color:#75715e"># Which spots on the board will be snakes or ladders?&lt;/span>
spots &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">trunc&lt;/span>(&lt;span style="color:#a6e22e">runif&lt;/span>(proportion &lt;span style="color:#f92672">*&lt;/span> board_size, &lt;span style="color:#ae81ff">1&lt;/span>, board_size))
&lt;span style="color:#75715e"># Assign to these spots either a snake or a ladder&lt;/span>
board[spots] &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">map_dbl&lt;/span>(spots, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">spot_alloc&lt;/span>(.x, board_size, mean, sd))
&lt;span style="color:#a6e22e">return&lt;/span>(board)
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Due to the clamping, the mean we specify in our argument to &lt;code>snl_board()&lt;/code> doesn&amp;rsquo;t have a purely linear relationship to the eventual mean of the entire board. We can see below that it actually resembles a logistic function.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Our board generation with only one variable.&lt;/span>
board_generator &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">function&lt;/span>(mean) {
&lt;span style="color:#75715e"># Constant arguments across all of the simulations&lt;/span>
board_length &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#ae81ff">100&lt;/span>
snl_prop &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#ae81ff">.19&lt;/span>
snl_sd &lt;span style="color:#f92672">&amp;lt;-&lt;/span> board_length &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span>
&lt;span style="color:#a6e22e">snl_board&lt;/span>(board_length, snl_prop, mean, snl_sd)
}
&lt;span style="color:#75715e"># Running the simulations&lt;/span>
&lt;span style="color:#a6e22e">crossing&lt;/span>(n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">10&lt;/span>, mean &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">seq&lt;/span>(&lt;span style="color:#ae81ff">-200&lt;/span>, &lt;span style="color:#ae81ff">200&lt;/span>, &lt;span style="color:#ae81ff">3&lt;/span>)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>( board_mean &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map_dbl&lt;/span>(mean, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">mean&lt;/span>(&lt;span style="color:#a6e22e">board_generator&lt;/span>(.x))) ) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_point&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(mean, board_mean), alpha &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.2&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Specified Mean&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Actual Mean&amp;#39;&lt;/span>,
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Specified Mean versus Actual Board Mean&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2020-05-09-snakes-and-ladders_files/figure-html/specified_vs_board_mean-1.png" width="672" />&lt;/p>
&lt;p>With a board and a game we can now run our simulations for the general case. For each game type and mean we&amp;rsquo;ll run 200 simulations.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">set.seed&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>)
&lt;span style="color:#75715e"># Simulate our snakes and ladders games for different means and finish types.&lt;/span>
general_snl_sim &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">crossing&lt;/span>(
n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">200&lt;/span>,
mean &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">-2&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">100&lt;/span>,
finish_type &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;exact&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;over&amp;#39;&lt;/span>)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
board &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map&lt;/span>(mean, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">board_generator&lt;/span>(.x)),
board_mean &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map_dbl&lt;/span>(board, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">mean&lt;/span>(.x)),
rolls &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">map2_dbl&lt;/span>(board, finish_type, &lt;span style="color:#f92672">~&lt;/span>&lt;span style="color:#a6e22e">snl_game&lt;/span>(.x, .y))
)
general_snl_sim &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_point&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(board_mean, rolls, colour &lt;span style="color:#f92672">=&lt;/span> finish_type), alpha &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.5&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">facet_wrap&lt;/span>(&lt;span style="color:#f92672">~&lt;/span>finish_type) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">theme&lt;/span>(legend.position &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;none&amp;#39;&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Board Mean&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Number of Dice Rolls&amp;#39;&lt;/span>,
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Simulated Snakes and Ladders&amp;#39;&lt;/span>,
subtitle &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Mean of the Board vs. Number of Dice Rolls&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2020-05-09-snakes-and-ladders_files/figure-html/snl_simulation-1.png" width="672" />&lt;/p>
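&lt;p>For a sense of scale, the grid of combinations used in the simulation can be rebuilt on its own and counted. This is only a sketch; the name &lt;code>sim_grid&lt;/code> is made up here, and no games are actually played:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">library(tidyr)

# Same grid as the simulation: 200 repeats x 103 means x 2 finish types
sim_grid = crossing(
  n = 1:200,
  mean = -2:100,
  finish_type = c("exact", "over")
)

# One row per simulated game
nrow(sim_grid)  # 200 * 103 * 2 = 41200
&lt;/code>&lt;/pre>&lt;/div>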
&lt;p>With the data in hand, we can now attempt to model the number of dice rolls versus the board mean to answer our question.&lt;/p>
&lt;h1 id="modeling">Modeling&lt;/h1>
&lt;p>We&amp;rsquo;ll keep it simple and fit an ordinary least squares regression to each of the finish types separately.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">library&lt;/span>(broom)
&lt;span style="color:#75715e"># Perform a regression against each group separately&lt;/span>
ols_models &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
general_snl_sim &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(finish_type) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">do&lt;/span>(model &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">lm&lt;/span>(rolls &lt;span style="color:#f92672">~&lt;/span> board_mean, data &lt;span style="color:#f92672">=&lt;/span> .) )
&lt;span style="color:#75715e"># Graph the linear regression &lt;/span>
general_snl_sim &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_point&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(board_mean, rolls, colour &lt;span style="color:#f92672">=&lt;/span> finish_type), alpha &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.3&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_smooth&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(board_mean, rolls), method &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;lm&amp;#39;&lt;/span>, formula &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;y ~ x&amp;#39;&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">facet_wrap&lt;/span>(&lt;span style="color:#f92672">~&lt;/span>finish_type) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Board Mean&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Dice Rolls&amp;#39;&lt;/span>,
colour &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Finish Type&amp;#39;&lt;/span>,
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Number of Dice Rolls vs Board Mean&amp;#39;&lt;/span>,
subtitle &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Split by Finish Type with OLS Best Fit&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2020-05-09-snakes-and-ladders_files/figure-html/ols_regression-1.png" width="672" />&lt;/p>
&lt;p>The intercepts, which represent the expected number of rolls at a board mean of 0, are 31.9 rolls for the exact finish type, and 27.5 rolls for the over finish type.&lt;/p>
&lt;p>The coefficient of the board mean variable is very similar for both finish types: -2.6 and -2.7 for the exact and over types respectively. This tells us that for every one unit increase in the board mean, the number of rolls to finish a game decreases on average by 2.6 and 2.7 rolls respectively.&lt;/p>
&lt;p>Whenever we discuss a linear model it&amp;rsquo;s not enough to simply discuss coefficients; we also need to discuss what our uncertainty is. However, let me put a pin in this and come back to it shortly when looking at the diagnostics of the fit.&lt;/p>
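&lt;p>To make the coefficients concrete, here is a minimal sketch that plugs the rounded estimates quoted above into the fitted line. The &lt;code>predict_rolls&lt;/code> helper and the board means of 0, 1 and 2 are illustrative assumptions rather than part of the original analysis.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Rounded coefficients quoted in the text (intercept and slope per finish type)
coefs &amp;lt;- data.frame(
  finish_type = c('exact', 'over'),
  intercept   = c(31.9, 27.5),
  slope       = c(-2.6, -2.7)
)

# Hypothetical helper: expected rolls under the fitted line for a board mean
predict_rolls &amp;lt;- function(board_mean, intercept, slope) {
  intercept + slope * board_mean
}

# Cross every finish type with a few illustrative board means
# (merge() with no shared columns produces all combinations)
board_means &amp;lt;- c(0, 1, 2)
predictions &amp;lt;- merge(coefs, data.frame(board_mean = board_means))
predictions$expected_rolls &amp;lt;- with(
  predictions,
  predict_rolls(board_mean, intercept, slope)
)

predictions[, c('finish_type', 'board_mean', 'expected_rolls')]
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Under these rounded values, each extra unit of board mean removes between two and three rolls from the expected game length for both finish types.&lt;/p>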
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">ols_models &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">tidy&lt;/span>(model)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 4 x 6
# Groups: finish_type [2]
finish_type term estimate std.error statistic p.value
&amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
1 exact (Intercept) 31.9 0.181 176. 0
2 exact board_mean -2.64 0.0318 -82.9 0
3 over (Intercept) 27.5 0.172 160. 0
4 over board_mean -2.72 0.0302 -90.1 0
&lt;/code>&lt;/pre>&lt;p>How well does the least squares fit model the number of rolls in terms of the mean of the board? The R-squared value tells us that the linear regression explains around 25% of the variance in the number of rolls for the exact finish type, and 28% for the over finish type. At first glance that seems low, however it&amp;rsquo;s probably reasonable given the randomness of the dice rolls and the snakes and ladders.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">ols_models &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">glance&lt;/span>(model) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(finish_type, r.squared)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code># A tibble: 2 x 2
# Groups: finish_type [2]
finish_type r.squared
&amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt;
1 exact 0.250
2 over 0.283
&lt;/code>&lt;/pre>&lt;p>The next step is to perform some diagnostics on these models. The first thing to look at is a graph of the residuals versus the fitted values.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">ols_models &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">augment&lt;/span>(model) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_point&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(.fitted, .resid, colour &lt;span style="color:#f92672">=&lt;/span> finish_type), alpha &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">.1&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_smooth&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(.fitted, .resid), method &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;loess&amp;#39;&lt;/span>, formula &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;y ~ x&amp;#39;&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">facet_wrap&lt;/span>(&lt;span style="color:#f92672">~&lt;/span>finish_type) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Fitted Value&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Residuals&amp;#39;&lt;/span>,
colour &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Finish Type&amp;#39;&lt;/span>,
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Residual Diagnostic Plot&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2020-05-09-snakes-and-ladders_files/figure-html/residual_plot-1.png" width="672" />&lt;/p>
&lt;p>There are two things that immediately stand out in this plot: potential non-linearity of the data, and the heteroscedasticity of the residuals.&lt;/p>
&lt;h2 id="non-linearity">Non-Linearity&lt;/h2>
&lt;p>The first property of the residual graph to notice is the uptick in the shape of the residuals between a fitted value of 30 and 40. This tells us that for fitted values less than 30 (or high board means), the linear regression is a reasonable fit to the data. However as the fitted values grow, there doesn&amp;rsquo;t appear to be a linear relationship between the response and predictor.&lt;/p>
&lt;p>The next steps from here would be to either transform the predictor before applying the linear regression, or find a more flexible model to fit to the data.&lt;/p>
&lt;h2 id="heterocedasticity">Heteroscedasticity&lt;/h2>
&lt;p>The second property to notice is the variance of the residuals increasing as the fitted values increase. This manifests itself as a funnel shape in the residual plot. Our residuals are &lt;strong>heteroscedastic&lt;/strong>, rather than &lt;strong>homoscedastic&lt;/strong>. An important assumption of a linear regression model is that the residuals have a constant variance. The standard errors and confidence intervals rely on this assumption.&lt;/p>
&lt;p>This variability is the reason that we didn&amp;rsquo;t take a look at the standard errors after performing our regression: given the non-constant variance of the residuals, the standard errors are not likely to be providing us with accurate information.&lt;/p>
&lt;p>A possible solution to this is to transform the response using a concave function (square root or log). This results in a greater shrinkage for larger responses, leading to a reduction in heteroscedasticity.&lt;/p>
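&lt;p>As a rough illustration of why a concave transform can help, the sketch below fits the same regression on the raw and the log scale. It uses synthetic data with multiplicative noise as a stand-in for the simulation results, so the data, the &lt;code>spread_ratio&lt;/code> helper and the numbers it produces are assumptions for illustration only, and say nothing about how the real data would behave.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">set.seed(1)

# Synthetic stand-in for the simulation data (an assumption): the response
# falls as board_mean grows, and the noise is multiplicative, so the spread
# is wider where the expected number of rolls is larger.
n &amp;lt;- 2000
board_mean &amp;lt;- runif(n, 0, 8)
rolls &amp;lt;- exp(3.5 - 0.1 * board_mean + rnorm(n, sd = 0.4))
sim &amp;lt;- data.frame(board_mean, rolls)

# Fit on the raw scale and on the log scale
fit_raw &amp;lt;- lm(rolls ~ board_mean, data = sim)
fit_log &amp;lt;- lm(log(rolls) ~ board_mean, data = sim)

# Crude funnel check (hypothetical helper): residual spread in the upper half
# of the fitted values divided by the spread in the lower half.
# A value close to 1 suggests roughly constant variance.
spread_ratio &amp;lt;- function(fit) {
  upper &amp;lt;- fitted(fit) &amp;gt; median(fitted(fit))
  sd(resid(fit)[upper]) / sd(resid(fit)[!upper])
}

c(raw = spread_ratio(fit_raw), log = spread_ratio(fit_log))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>On this synthetic data the ratio for the raw fit sits well above 1, while the log fit is close to 1. The same check could be run against the real models, keeping in mind that a log response changes the interpretation of the coefficients.&lt;/p>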
&lt;h1 id="conclusion">Conclusion&lt;/h1>
&lt;p>At the outset of this article I wanted to answer two questions: what is the mean number of rolls it takes to finish a snakes and ladders game on a specific board, and what is the mean number of rolls to finish a game on a general board.&lt;/p>
&lt;p>In the specific instance we simulated a large number of games on the specific board. Using this data we were able to determine the mean rolls, as well as the lower 5% and upper 95% bounds.&lt;/p>
&lt;p>In the general instance we again simulated a large number of games on boards with different means. We fit an ordinary least squares model to the data, but saw two issues: some non-linearity of the data in certain ranges of the independent variable, and heteroscedasticity of the residuals. Further work would be needed - either by transforming the data or by using a more flexible model - to get more accurate estimates and confidence intervals of the mean number of rolls across all board means.&lt;/p></description></item><item><title>Packet Analysis with R (Part 1)</title><link>https://clt.blog.foletta.net/post/packet-analysis-with-r-part-1/</link><pubDate>Sat, 22 Feb 2020 00:00:00 +0000</pubDate><guid>https://clt.blog.foletta.net/post/packet-analysis-with-r-part-1/</guid><description>&lt;p>As a network security consultant I&amp;rsquo;ve spent my fair share of time trawling through packet captures, looking for that clue or piece of evidence I hope will lead me to the root cause of a problem. Wireshark is the tool par excellence for interpreting and investigating packet captures, however I&amp;rsquo;ve always found that it&amp;rsquo;s best suited to bottom-up rather than top-down analysis. Opening a packet capture you&amp;rsquo;re bombarded with information, the minutiae from each packet instantly available. This is perfect if you know what you&amp;rsquo;re looking for and can filter out the noise to concentrate on your problem. But if you don&amp;rsquo;t have a clear view of what&amp;rsquo;s in the capture, or where the problem may lie, taking a step back and removing yourself from the details in Wireshark can be difficult.&lt;/p>
&lt;p>Recently, while reading Hadley Wickham&amp;rsquo;s &lt;a href="https://vita.had.co.nz/papers/tidy-data.pdf">R for Data Science&lt;/a> book, the chapter on &lt;a href="https://vita.had.co.nz/papers/tidy-data.pdf">tidy data&lt;/a> resonated with me. For the uninitiated, &amp;lsquo;tidy data&amp;rsquo; is a method of organising columnar data to facilitate easier analysis. It can be summarised by three main properties:&lt;/p>
&lt;ol>
&lt;li>Each variable must have its own column.&lt;/li>
&lt;li>Each observation must have its own row.&lt;/li>
&lt;li>Each value must have its own cell.&lt;/li>
&lt;/ol>
&lt;p>What struck me was that this perfectly described packet captures: each captured packet being an observation, and the values of the dissectors the variables. I wondered whether analysis of packet captures could be done with R and be the top-down complement to Wireshark&amp;rsquo;s bottom-up approach.&lt;/p>
&lt;p>In this article I want to show how valuable it can be to perform packet analysis using the R language. It&amp;rsquo;s broken into three sections:&lt;/p>
&lt;ul>
&lt;li>Conversion of packet captures to columnar data.&lt;/li>
&lt;li>Creation of Wireshark analogies in R.&lt;/li>
&lt;li>Deeper dive into the packet captures.&lt;/li>
&lt;/ul>
&lt;p>There&amp;rsquo;s nothing too complicated in here: no regressions, no categorisation, no machine learning. It&amp;rsquo;s predominantly about filtering, counting, summarising and visualising aspects of packet captures. What I hope you will see is the power and usefulness in these simple actions.&lt;/p>
&lt;h1 id="pcap-to-csv-transformation">PCAP to CSV Transformation&lt;/h1>
&lt;p>The first step is to convert the packet capture file into a format that R can ingest. I chose the comma-separated values (CSV) format for its simplicity and human readability, however &lt;a href="https://www.sqlite.org/index.html">SQLite&lt;/a> and &lt;a href="https://parquet.apache.org/">Parquet&lt;/a> are other viable options.&lt;/p>
&lt;p>We download a sample packet capture file and run a PCAP to CSV conversion script I&amp;rsquo;ve written over the top of it:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">packet_capture &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#e6db74">&amp;#39;./sample.pcap&amp;#39;&lt;/span>
pcap_to_csv &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#e6db74">&amp;#39;./pcap_to_csv&amp;#39;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># Download the sample packet capture&lt;/span>
&lt;span style="color:#a6e22e">if &lt;/span>(&lt;span style="color:#f92672">!&lt;/span>&lt;span style="color:#a6e22e">file.exists&lt;/span>(packet_capture)) {
&lt;span style="color:#a6e22e">download.file&lt;/span>(
url &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;https://tinyurl.com/h545tup&amp;#39;&lt;/span>,
destfile &lt;span style="color:#f92672">=&lt;/span> packet_capture
)
}
&lt;span style="color:#75715e"># Download the PCAP to CSV script to the CWD.&lt;/span>
&lt;span style="color:#a6e22e">download.file&lt;/span>(
url &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;https://tinyurl.com/utdbj44&amp;#39;&lt;/span>,
destfile &lt;span style="color:#f92672">=&lt;/span> pcap_to_csv
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>At a high level the script performs the following actions:&lt;/p>
&lt;ul>
&lt;li>Spawns a &lt;code>tshark&lt;/code> process which runs over the packet capture, outputting the specified fields in JSON format to STDOUT.&lt;/li>
&lt;li>Reads the JSON from STDOUT and flattens the data structure.&lt;/li>
&lt;li>Outputs all of the fields as CSV.&lt;/li>
&lt;/ul>
&lt;p>Spawning the tshark process and reading from STDOUT is not the cleanest of implementations, but it does the job we need it to do.&lt;/p>
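&lt;p>The flattening step can be sketched in R, although the real script is written in Perl and may well differ in its details. The sketch assumes that, once decoded, each packet is a set of field names that each map to one or more values; the &lt;code>packet&lt;/code> example and the &lt;code>flatten_packet&lt;/code> helper are hypothetical.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># One decoded packet (assumed shape): each field name maps to a vector of
# one or more values. A field can repeat within a frame, for example when
# an ICMP error carries the header of the original IP packet.
packet &amp;lt;- list(
  'frame.number' = '1',
  'ip.src'       = c('10.0.0.1', '192.168.3.131'),
  'tcp.dstport'  = '80'
)

# Hypothetical helper: give every value its own column, suffixed with its
# zero-based position within the field, e.g. ip.src.0 and ip.src.1
flatten_packet &amp;lt;- function(layers) {
  flat &amp;lt;- unlist(lapply(names(layers), function(field) {
    values &amp;lt;- layers[[field]]
    setNames(values, paste0(field, '.', seq_along(values) - 1))
  }))
  as.data.frame(as.list(flat), stringsAsFactors = FALSE)
}

# A one-row data frame that could be written out with write.csv()
flatten_packet(packet)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Suffixing every column with its position keeps repeated fields from colliding, at the cost of a &lt;code>.0&lt;/code> on fields that only ever appear once.&lt;/p>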
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">time perl pcap_to_csv sample.pcap
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## Gathering dissectors...
## Extracting packets...
## Decoding JSON...
## Flattening packets...
## Creating sample.pcap.csv
## perl pcap_to_csv sample.pcap 17.33s user 0.79s system 101% cpu 17.774 total
&lt;/code>&lt;/pre>&lt;p>What&amp;rsquo;s the size differential?&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-zsh" data-lang="zsh">ls -lh sample.* | awk &lt;span style="color:#e6db74">&amp;#39;{ print $5, $9 }&amp;#39;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## 9.1M sample.pcap
## 101M sample.pcap.csv
&lt;/code>&lt;/pre>&lt;p>We see there&amp;rsquo;s roughly an 11:1 size ratio between the CSV and the original packet capture.&lt;/p>
&lt;p>This CSV file is then ingested into R and some mutations are performed:&lt;/p>
&lt;ol>
&lt;li>We remove the &amp;lsquo;.0&amp;rsquo; from the end of variable names. This allows us to refer directly to variables that are only in a frame once, e.g. &lt;code>pcap['tcp.dstport']&lt;/code> instead of &lt;code>pcap['tcp.dstport.0']&lt;/code>.&lt;/li>
&lt;li>The &lt;code>frame.time&lt;/code> field is changed to a &lt;code>POSIXct&lt;/code> date-time class rather than a simple character string.&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">library&lt;/span>(glue)
&lt;span style="color:#a6e22e">library&lt;/span>(tidyverse)
&lt;span style="color:#a6e22e">library&lt;/span>(kableExtra)
&lt;span style="color:#75715e"># Ingest the packet capture&lt;/span>
pcap &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
&lt;span style="color:#a6e22e">glue&lt;/span>(packet_capture, &lt;span style="color:#e6db74">&amp;#34;.csv&amp;#34;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">read_csv&lt;/span>(guess_max &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">100000&lt;/span>)
&lt;span style="color:#75715e"># Remove the trailing &amp;#39;.0&amp;#39; from the column names&lt;/span>
&lt;span style="color:#a6e22e">names&lt;/span>(pcap) &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#a6e22e">names&lt;/span>(pcap) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span> &lt;span style="color:#a6e22e">str_remove&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;\\.0$&amp;#39;&lt;/span>)
&lt;span style="color:#75715e"># Update the frame.time column&lt;/span>
pcap &lt;span style="color:#f92672">&amp;lt;-&lt;/span>
pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(frame.time &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">as.POSIXct&lt;/span>(
frame.time_epoch,
tz &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;UTC&amp;#39;&lt;/span>,
origin &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;1970-01-01 00:00:00&amp;#39;&lt;/span>
))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Taking a look at some of the key variables in the first five rows:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#75715e"># First peek&lt;/span>
pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(
frame.time,
ip.src, ip.dst,
tcp.dstport, tcp.stream
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">slice&lt;/span>(&lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#f92672">:&lt;/span>&lt;span style="color:#ae81ff">5&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 5 x 5
## frame.time ip.src ip.dst tcp.dstport tcp.stream
## &amp;lt;dttm&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
## 1 2011-01-25 18:52:22 192.168.3.131 72.14.213.138 80 0
## 2 2011-01-25 18:52:22 72.14.213.138 192.168.3.131 57011 0
## 3 2011-01-25 18:52:22 192.168.3.131 72.14.213.102 80 1
## 4 2011-01-25 18:52:22 192.168.3.131 72.14.213.138 80 0
## 5 2011-01-25 18:52:22 72.14.213.102 192.168.3.131 55950 1
&lt;/code>&lt;/pre>&lt;h1 id="wireshark-analogies">Wireshark Analogies&lt;/h1>
&lt;p>Now that we&amp;rsquo;ve got our data into R, let&amp;rsquo;s explore it. To start with we&amp;rsquo;ll emulate some of the native outputs of Wireshark.&lt;/p>
&lt;h2 id="io-graph">I/O Graph&lt;/h2>
&lt;p>This is the default graph you would find by going to [Statistics -&amp;gt; I/O Graph] in Wireshark. We round each frame&amp;rsquo;s relative time to the nearest second and tally up the number of frames falling within each of these seconds.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(t &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">round&lt;/span>(frame.time_relative)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">tally&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_line&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(t, n)) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Total Input/Output&amp;#39;&lt;/span>,
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Seconds Since Start of Capture&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Frame Count&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2019-12-19-packet-analysis-with-r_files/figure-html/packets_per_second-1.png" width="672" />&lt;/p>
&lt;h2 id="ip-conversations">IP Conversations&lt;/h2>
&lt;p>This is similar to the output you would get by going to [Statistics -&amp;gt; Conversations -&amp;gt; IP]. We group by source and destination IP address and count the number of packets and the number of kilobytes in each of these &lt;em>unidirectional&lt;/em> IP conversations.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(ip.src, ip.dst) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(
packets &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">n&lt;/span>(),
kbytes &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">sum&lt;/span>(frame.len)&lt;span style="color:#f92672">/&lt;/span>&lt;span style="color:#ae81ff">1000&lt;/span>
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">arrange&lt;/span>(&lt;span style="color:#a6e22e">desc&lt;/span>(packets)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">head&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 6 x 4
## # Groups: ip.src [5]
## ip.src ip.dst packets kbytes
## &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;int&amp;gt; &amp;lt;dbl&amp;gt;
## 1 65.54.95.68 192.168.3.131 1275 1719.
## 2 204.14.234.85 192.168.3.131 1036 957.
## 3 65.54.95.75 192.168.3.131 766 879.
## 4 192.168.3.131 204.14.234.85 740 478.
## 5 192.168.3.131 65.54.95.68 664 66.9
## 6 65.54.95.140 192.168.3.131 658 763.
&lt;/code>&lt;/pre>&lt;h2 id="protocols">Protocols&lt;/h2>
&lt;p>In this graph we&amp;rsquo;re trying to emulate [Statistics -&amp;gt; Protocol Hierarchy]. The &lt;code>frame.protocols&lt;/code> field lists the dissectors used in the frame, separated by colons. A regex is used to extract the leading dissectors (the first one plus up to four more) into a new variable. We then count the number of frames for each value of this variable.&lt;/p>
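&lt;p>To see what the pattern used below actually captures, here is a small base R sketch run against two example &lt;code>frame.protocols&lt;/code> values. The pattern matches one dissector followed by up to four more; the first input string is an illustrative assumption, and &lt;code>regmatches()&lt;/code> stands in for &lt;code>str_extract()&lt;/code> so that the sketch needs no packages.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Example frame.protocols values: a short stack and a longer one
protocols &amp;lt;- c(
  'eth:ethertype:ip:tcp',
  'eth:ethertype:ip:icmp:ip:udp:ssdp'
)

# Same pattern as the pipeline: one dissector, then up to four more
pattern &amp;lt;- '(\\w+)(:\\w+){0,4}'

# regexpr() locates the first match in each string and
# regmatches() extracts the matched text
leading &amp;lt;- regmatches(protocols, regexpr(pattern, protocols))
leading
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The short stack is returned unchanged, while the longer one is truncated to &lt;code>eth:ethertype:ip:icmp:ip&lt;/code>.&lt;/p>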
&lt;p>We graph the output slightly differently, first flipping the coordinates so that the x-axis runs top to bottom and the y-axis runs left to right, then scaling the y-axis (the frame count) logarithmically.&lt;/p>
&lt;p>No surprises that TCP traffic accounts for the most packets, followed by SSL (TLS) and HTTP.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
first_4_proto &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">str_extract&lt;/span>(frame.protocols, &lt;span style="color:#e6db74">&amp;#39;(\\w+)(:\\w+){0,4}&amp;#39;&lt;/span>)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">count&lt;/span>(first_4_proto) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_col&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(&lt;span style="color:#a6e22e">fct_reorder&lt;/span>(first_4_proto, n), n)) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">coord_flip&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">scale_y_log10&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Packet Capture Protocols&amp;#39;&lt;/span>,
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;First Four Dissectors&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Total Frames (Log Scale)&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2019-12-19-packet-analysis-with-r_files/figure-html/protocols-1.png" width="672" />&lt;/p>
&lt;h2 id="packet-lengths">Packet Lengths&lt;/h2>
&lt;p>This graph is a visual representation of [Statistics -&amp;gt; Packet Lengths]. The x-axis is broken up into bins of 50 bytes, and the height of each bar represents the number of packets seen with a size within that range, plotted on a log scale. The bars are also coloured based on whether the packet is a TCP acknowledgement or not.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(is_ack &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#f92672">!&lt;/span>&lt;span style="color:#a6e22e">is.na&lt;/span>(tcp.analysis.acks_frame)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_histogram&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(frame.len, fill &lt;span style="color:#f92672">=&lt;/span> is_ack), binwidth &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">50&lt;/span>) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Frame Size (Bytes)&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Number of Frames (Log Scale)&amp;#39;&lt;/span>,
fill &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Is ACK Segment?&amp;#39;&lt;/span>
) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">scale_y_log10&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2019-12-19-packet-analysis-with-r_files/figure-html/packet_lengths-1.png" width="672" />&lt;/p>
&lt;h1 id="exploratory">Exploratory&lt;/h1>
&lt;p>Having emulated (to an extent) some of the Wireshark statistical information, let&amp;rsquo;s dig a little deeper and see what else we can discover about this particular packet capture.&lt;/p>
&lt;h2 id="http-hosts">HTTP Hosts&lt;/h2>
&lt;p>Let&amp;rsquo;s explore which HTTP hosts requests are being made to. We filter out all packets without the &lt;code>http.host&lt;/code> field, which contains the value of the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Host">Host header&lt;/a>, and count the number of occurrences of each distinct value.&lt;/p>
&lt;p>We see an MSN address topping the list, however interestingly a multicast address is second.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"> pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
dplyr&lt;span style="color:#f92672">::&lt;/span>&lt;span style="color:#a6e22e">filter&lt;/span>(&lt;span style="color:#f92672">!&lt;/span>&lt;span style="color:#a6e22e">is.na&lt;/span>(http.host)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">count&lt;/span>(http.host) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">top_n&lt;/span>(&lt;span style="color:#ae81ff">20&lt;/span>, n) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_col&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(&lt;span style="color:#a6e22e">fct_reorder&lt;/span>(http.host, n), n)) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">coord_flip&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
title &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Requests per HTTP Host Header&amp;#39;&lt;/span>,
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Host&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Number of HTTP requests&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2019-12-19-packet-analysis-with-r_files/figure-html/unnamed-chunk-1-1.png" width="672" />&lt;/p>
&lt;p>Let&amp;rsquo;s dive a little deeper on this - what are the protocols of these multicast HTTP packets?&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
dplyr&lt;span style="color:#f92672">::&lt;/span>&lt;span style="color:#a6e22e">filter&lt;/span>(http.host &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;239.255.255.250:1900&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(frame.protocols) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">distinct&lt;/span>()
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 2 x 1
## frame.protocols
## &amp;lt;chr&amp;gt;
## 1 eth:ethertype:ip:udp:ssdp
## 2 eth:ethertype:ip:icmp:ip:udp:ssdp
&lt;/code>&lt;/pre>&lt;p>We see that it&amp;rsquo;s &lt;a href="https://en.wikipedia.org/wiki/Simple_Service_Discovery_Protocol">SSDP&lt;/a> multicasting out, as well as other hosts responding with ICMP messages. What are the ICMP messages?&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
dplyr&lt;span style="color:#f92672">::&lt;/span>&lt;span style="color:#a6e22e">filter&lt;/span>(
frame.protocols &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&amp;#39;eth:ethertype:ip:icmp:ip:udp:ssdp&amp;#39;&lt;/span>
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(icmp.type, icmp.code)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 20 x 2
## icmp.type icmp.code
## &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
## 1 11 0
## 2 11 0
## 3 11 0
## 4 11 0
## 5 11 0
## 6 11 0
## 7 11 0
## 8 11 0
## 9 11 0
## 10 11 0
## 11 11 0
## 12 11 0
## 13 11 0
## 14 11 0
## 15 11 0
## 16 11 0
## 17 11 0
## 18 11 0
## 19 11 0
## 20 11 0
&lt;/code>&lt;/pre>&lt;p>Type 11 (time exceeded) code 0 (time to live exceeded in transit) messages.&lt;/p>
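&lt;p>Twenty identical rows are easier to read as a single count. Here is a minimal sketch on stand-in data: the &lt;code>icmp&lt;/code> tibble is made up to mirror the output above and is not read from the capture.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">library(dplyr)

# Stand-in for the filtered capture: twenty type 11, code 0 messages
icmp &amp;lt;- tibble(icmp.type = rep(11, 20), icmp.code = rep(0, 20))

# Collapse the duplicates into one row per type/code pair
icmp_summary &amp;lt;- icmp %&amp;gt;%
    count(icmp.type, icmp.code)

icmp_summary
&lt;/code>&lt;/pre>&lt;/div>&lt;p>On the real data the same &lt;code>count()&lt;/code> call could replace the &lt;code>select()&lt;/code> in the pipeline above.&lt;/p>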
&lt;h2 id="tls-versions-and-ciphers">TLS Versions and Ciphers&lt;/h2>
&lt;p>Taking more of a security perspective, let&amp;rsquo;s take a look at the SSL/TLS versions and ciphers being used.&lt;/p>
&lt;p>During the TLS handshake, the ClientHello message has two versions: the record version which indicates which version of the ClientHello is being sent, and the handshake version which indicates the version of the protocol the client/server wishes to communicate on during the session. We&amp;rsquo;re concerned with the handshake version:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
dplyr&lt;span style="color:#f92672">::&lt;/span>&lt;span style="color:#a6e22e">filter&lt;/span>(&lt;span style="color:#f92672">!&lt;/span>&lt;span style="color:#a6e22e">is.na&lt;/span>(ssl.handshake.version)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">count&lt;/span>(ssl.handshake.version)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 2 x 2
## ssl.handshake.version n
## &amp;lt;chr&amp;gt; &amp;lt;int&amp;gt;
## 1 0x00000300 10
## 2 0x00000301 126
&lt;/code>&lt;/pre>&lt;p>The predominant version is TLS 1.0 (0x0301), with some SSL 3.0 (0x0300).&lt;/p>
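&lt;p>The version codes are easy to misread, so a small lookup can help. This is a sketch only: the &lt;code>tls_names&lt;/code> vector is my own helper, filled in with the protocol version numbers from the SSL/TLS specifications, and the input strings are copied from the output above.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># Handshake versions as they appear in the capture (zero-padded hex strings)
versions &amp;lt;- c(&amp;#39;0x00000300&amp;#39;, &amp;#39;0x00000301&amp;#39;)

# Hypothetical helper: protocol names keyed by the decimal version number
tls_names &amp;lt;- c(
    &amp;#39;768&amp;#39; = &amp;#39;SSL 3.0&amp;#39;,  # 0x0300
    &amp;#39;769&amp;#39; = &amp;#39;TLS 1.0&amp;#39;,  # 0x0301
    &amp;#39;770&amp;#39; = &amp;#39;TLS 1.1&amp;#39;,  # 0x0302
    &amp;#39;771&amp;#39; = &amp;#39;TLS 1.2&amp;#39;   # 0x0303
)

# Strip the 0x prefix, parse as base 16, then look up the name
version_int &amp;lt;- strtoi(sub(&amp;#39;^0x&amp;#39;, &amp;#39;&amp;#39;, versions), base = 16L)
decoded &amp;lt;- unname(tls_names[as.character(version_int)])

decoded
# "SSL 3.0" "TLS 1.0"
&lt;/code>&lt;/pre>&lt;/div>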
&lt;p>What about the ciphers being used? By filtering out packets that aren&amp;rsquo;t part of the handshake and selecting the ciphersuite variable we can get an idea.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
dplyr&lt;span style="color:#f92672">::&lt;/span>&lt;span style="color:#a6e22e">filter&lt;/span>(&lt;span style="color:#f92672">!&lt;/span>&lt;span style="color:#a6e22e">is.na&lt;/span>(ssl.handshake.ciphersuite)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">select&lt;/span>(ssl.handshake.ciphersuite)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 122 x 1
## ssl.handshake.ciphersuite
## &amp;lt;dbl&amp;gt;
## 1 49162
## 2 5
## 3 49162
## 4 49162
## 5 49162
## 6 49162
## 7 49162
## 8 49162
## 9 5
## 10 5
## # … with 112 more rows
&lt;/code>&lt;/pre>&lt;p>Unfortunately we don&amp;rsquo;t get the ciphersuite in a human readable format. Instead we get the decimal version of the two-byte identification number. This makes it difficult to make a security judgement on these ciphers.&lt;/p>
&lt;p>Let&amp;rsquo;s translate these into a human readable format. &lt;a href="http://realtimelogic.com/ba/doc/en/C/shark/group__SharkSslCiphers.html">This website&lt;/a> has a translation table and also - thankfully - has a CSS element that we can use to pull out the values.&lt;/p>
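&lt;p>As a quick sanity check, the decimal values can be turned back into the two-byte hex form that ciphersuite tables usually list. A minimal sketch using the two values visible in the output above:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r"># The two ciphersuite identifiers seen in the first ten rows
ciphersuite_ids &amp;lt;- c(49162, 5)

# Format each as a zero-padded, four-digit hex value
ciphersuite_hex &amp;lt;- sprintf(&amp;#39;0x%04X&amp;#39;, as.integer(ciphersuite_ids))

ciphersuite_hex
# "0xC00A" "0x0005"
&lt;/code>&lt;/pre>&lt;/div>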
&lt;p>The &lt;code>rvest&lt;/code> library is used to download the page, pull out the table entries, and convert them to text. Each entry is a string with the ciphersuite name and hex separated by spaces, so those are split, and finally the columns are given sensible names.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">&lt;span style="color:#a6e22e">library&lt;/span>(rvest)
s &lt;span style="color:#f92672">&amp;lt;-&lt;/span> &lt;span style="color:#e6db74">&amp;#39;https://tinyurl.com/t74r83x&amp;#39;&lt;/span>
s &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
xml2&lt;span style="color:#f92672">::&lt;/span>&lt;span style="color:#a6e22e">read_html&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">html_nodes&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;.memItemRight&amp;#39;&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">html_text&lt;/span>() &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">str_split_fixed&lt;/span>(&lt;span style="color:#e6db74">&amp;#34;\\s+&amp;#34;&lt;/span>, n &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">as_tibble&lt;/span>(.name_repair &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#f92672">~&lt;/span>{ &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;ciphersuite&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;hex_value&amp;#39;&lt;/span>) }) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(hex_value &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">as.hexmode&lt;/span>(hex_value)) &lt;span style="color:#f92672">-&amp;gt;&lt;/span>
cipher_mappings
&lt;span style="color:#a6e22e">head&lt;/span>(cipher_mappings)
&lt;/code>&lt;/pre>&lt;/div>&lt;pre>&lt;code>## # A tibble: 6 x 2
## ciphersuite hex_value
## &amp;lt;chr&amp;gt; &amp;lt;hexmode&amp;gt;
## 1 TLS_NULL_WITH_NULL_NULL 0
## 2 TLS_RSA_WITH_NULL_MD5 1
## 3 TLS_RSA_WITH_NULL_SHA 2
## 4 TLS_RSA_WITH_RC4_128_MD5 4
## 5 TLS_RSA_WITH_RC4_128_SHA 5
## 6 TLS_RSA_WITH_DES_CBC_SHA 9
&lt;/code>&lt;/pre>&lt;p>We&amp;rsquo;re only concerned with the ciphersuite the ServerHello responds with, because this is the one that is ultimately used. Thus other records are filtered out, the occurrences of each distinct ciphersuite are counted, and the values are converted to hex.&lt;/p>
&lt;p>A left join by the hex values is performed which adds the &lt;code>ciphersuite&lt;/code> column to the data. The data is presented as a bar graph, the length of each bar representing the number of times each ciphersuite was used in a TLS connection.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
dplyr&lt;span style="color:#f92672">::&lt;/span>&lt;span style="color:#a6e22e">filter&lt;/span>(ssl.handshake.type &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">count&lt;/span>(ssl.handshake.ciphersuite) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">mutate&lt;/span>(
cs &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">as.hexmode&lt;/span>(ssl.handshake.ciphersuite)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">left_join&lt;/span>(
cipher_mappings,
by &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">c&lt;/span>(&lt;span style="color:#e6db74">&amp;#39;cs&amp;#39;&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;hex_value&amp;#39;&lt;/span>)
) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_col&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(ciphersuite, n)) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">coord_flip&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;TLS Ciphersuite&amp;#39;&lt;/span>, y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Total TLS Sessions&amp;#39;&lt;/span>)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2019-12-19-packet-analysis-with-r_files/figure-html/tls_ciphers-1.png" width="672" />&lt;/p>
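&lt;p>One thing to keep in mind with a left join is that any ciphersuite missing from the mapping table comes through with an &lt;code>NA&lt;/code> name. Here is a sketch of one way to keep those rows identifiable. The &lt;code>counts&lt;/code> and &lt;code>mappings&lt;/code> tibbles are toy stand-ins with made-up counts, and plain integers are used as the join key to keep the example small.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">library(dplyr)

# Toy stand-ins: counted ciphersuites and a deliberately incomplete mapping
counts &amp;lt;- tibble(cs = c(5L, 49162L, 4866L), n = c(30L, 90L, 2L))
mappings &amp;lt;- tibble(
    hex_value = c(5L, 49162L),
    ciphersuite = c(
        &amp;#39;TLS_RSA_WITH_RC4_128_SHA&amp;#39;,
        &amp;#39;TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA&amp;#39;
    )
)

# Rows without a match get a placeholder label instead of NA
labelled &amp;lt;- counts %&amp;gt;%
    left_join(mappings, by = c(&amp;#39;cs&amp;#39; = &amp;#39;hex_value&amp;#39;)) %&amp;gt;%
    mutate(
        ciphersuite = coalesce(ciphersuite, sprintf(&amp;#39;unknown (0x%04X)&amp;#39;, cs))
    )

labelled
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The third row ends up labelled &lt;code>unknown (0x1302)&lt;/code> rather than disappearing into an &lt;code>NA&lt;/code> bar on the graph.&lt;/p>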
&lt;h2 id="dns-response-times">DNS Response Times&lt;/h2>
&lt;p>Finally, let&amp;rsquo;s take a look at DNS response times. We filter for DNS responses, group by the query and the response type, calculate the mean response time for each of these groups and plot it.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">pcap &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
dplyr&lt;span style="color:#f92672">::&lt;/span>&lt;span style="color:#a6e22e">filter&lt;/span>(dns.flags.response &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">group_by&lt;/span>(dns.qry.name, dns.resp.type) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">summarise&lt;/span>(mean_resp &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">mean&lt;/span>(dns.time)) &lt;span style="color:#f92672">%&amp;gt;%&lt;/span>
&lt;span style="color:#a6e22e">ggplot&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">geom_col&lt;/span>(&lt;span style="color:#a6e22e">aes&lt;/span>(
&lt;span style="color:#a6e22e">fct_reorder&lt;/span>(dns.qry.name, mean_resp),
mean_resp,
fill &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#a6e22e">as.factor&lt;/span>(dns.resp.type)
)) &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">coord_flip&lt;/span>() &lt;span style="color:#f92672">+&lt;/span>
&lt;span style="color:#a6e22e">labs&lt;/span>(
x &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;DNS Query Name&amp;#39;&lt;/span>,
y &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;Mean Response Time (seconds)&amp;#39;&lt;/span>,
fill &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;DNS Response Type&amp;#39;&lt;/span>
)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://clt.blog.foletta.net/post/2019-12-19-packet-analysis-with-r_files/figure-html/unnamed-chunk-2-1.png" width="672" />&lt;/p>
&lt;p>What we see is that the response time differs between domains. The shorter response times have a low variance, indicating they likely came from the resolver&amp;rsquo;s cache. Other responses have a higher variance, either because of network latency to authoritative DNS servers, or because other DNS resolvers in the chain (opaque to us) have the entry cached as well.&lt;/p>
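&lt;p>The graph only shows means, so the spread has to be calculated separately. A minimal sketch of how that could look, using hypothetical response times rather than values from the capture:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-r" data-lang="r">library(dplyr)

# Hypothetical DNS response times in seconds; not taken from the capture
dns &amp;lt;- tibble(
    dns.qry.name = rep(c(&amp;#39;a.example&amp;#39;, &amp;#39;b.example&amp;#39;), each = 3),
    dns.time = c(0.001, 0.002, 0.001, 0.020, 0.150, 0.060)
)

# Mean and standard deviation per query name, least variable first
spread &amp;lt;- dns %&amp;gt;%
    group_by(dns.qry.name) %&amp;gt;%
    summarise(
        responses = n(),
        mean_resp = mean(dns.time),
        sd_resp = sd(dns.time)
    ) %&amp;gt;%
    arrange(sd_resp)

spread
&lt;/code>&lt;/pre>&lt;/div>&lt;p>On the real data the same &lt;code>summarise()&lt;/code> would slot into the pipeline above in place of the mean-only version.&lt;/p>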
&lt;h1 id="summary">Summary&lt;/h1>
&lt;p>In this article I&amp;rsquo;ve discussed the conversion of packet capture files to CSV, and exploratory data analysis of these packet captures. I hope I&amp;rsquo;ve shown how standard packet capture analysis with Wireshark can be complemented by analysis with the R language.&lt;/p></description></item></channel></rss>