
Monday, 7 April 2025

Talk about Cloud Prices at PyConLT 2025


Introduction to Cloud Pricing

I am looking forward to speaking at PyConLT 2025.
My talk is called Cutting the Price of Scraping Cloud Costs.

It's been a while (12 years!) since my last Python conference, EuroPython Florence 2012, where I spoke as a Django web developer, although I did give a Golang talk at KubeCon USA last year.

I work at EDB, the Postgres company, on our Postgres AI product, the cloud version of which runs across the main cloud providers: AWS, Azure and GCP.

The team I am in handles the identity management and billing components of the product. So whilst I am mainly a Golang micro-service developer, I have dipped my toe into Data Science, having rewritten our cloud prices ETL using Python & Airflow - the subject of my talk in Lithuania.

Cloud pricing can be surprisingly complex ... and the price lists are not small.

The full price lists for the 3 CSPs together contain almost 5 million prices - known as SKUs (Stock Keeping Unit prices):

csp x service x type x tier x region
 3  x   200   x  50  x  3   x  50   = 4.5 million

csp = AWS, Azure and GCP

service = vms, k8s, network, load balancer, storage etc.

type = e.g. storage - general purpose E2, N1 ... accelerated A1, A2  multiplied by various property sizes

tier  = T-shirt size tiers of usage, ie more use = cheaper rate - small, medium, large

region = us-east-1, us-west-2, af-south-1, etc.
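
As a back-of-envelope check, the multiplication above in Python:

# rough size of the combined CSP price space, using the figures above
csps, services, types, tiers, regions = 3, 200, 50, 3, 50
print(f"{csps * services * types * tiers * regions:,}")  # 4,500,000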

We need to gather all the latest service SKUs that our Postgres AI may use and total them up as a cost estimate for when customers are selecting the various options for creating or adding to their installation, applying the additional pricing for our product and any private offer discounts as part of this process.

Therefore we needed to build a data pipeline to gather the SKUs and keep them current.

Previously we used data from a 3rd party kubecost-based provider, however our usage was not sufficient to justify paying for this particular cloud service once its free usage expired.

Hence we needed to rewrite our cloud pricing data pipeline. This pipeline is in Apache Airflow but it could equally be in Dagster or any other data pipeline framework.

My talk deals with the wider points around cloud pricing, refactoring a data pipeline and pipeline framework options. But here I want to provide more detail on the data pipeline's Python code, its use of Embedded Postgres and Click, and the benefits for development and testing - some things I didn't have room for in the talk.


Outline of our use of Data Pipelines

Airflow, Dagster, etc. provide many tools for pipeline development, notably a local development mode for running up the pipeline framework locally and doing test runs. Even with some reloading on edit, it can still be a slow process to run up a pipeline and then execute the full set of steps, known as a directed acyclic graph (DAG).

One way to improve the DevX is to encapsulate each DAG step's code as much as possible, removing use of shared state where that is viable and allowing individual steps to be separately tested, rapidly, with fixture data and with fast stand up and tear down of temporary embedded storage.

To avoid shared state persistence across the whole pipeline we use extract transform load (ETL) within each step, rather than across the whole pipeline. This enables functional running and testing of individual steps outside the pipeline.
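
To make that concrete, here is a minimal sketch of the shape of one such self-contained step - the helper names (start_pg_embed, dump_to_storage, teardown_pg_embed) are illustrative stand-ins, not the real project API:

def scrape_aws_step():
    """One step = one full ETL cycle against its own throwaway storage"""
    folder, port = start_pg_embed()        # stand up temporary embedded Postgres
    try:
        scraper = AWSScraper(port)         # Extract + Transform from the CSP API
        scraper.scrape_sku()
        dump_to_storage(port, "aws")       # Load: persist the SQL artefact
    finally:
        teardown_pg_embed(folder)          # no shared state survives the step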


The Scraper Class

We need a standard scraper class to fetch the cloud prices from each CSP so use an abstract base class.


from abc import ABC

class BaseScraper(ABC):
    """Abstract base class for Scrapers"""

    batch = 500
    conn = None
    unit_map = {"FAIL": ""}
    root_url = ""

    def map_units(self, entry, key):
        """To standardize naming of units between CSPs"""
        return self.unit_map.get(entry.get(key, "FAIL"), entry[key])

    def scrape_sku(self):
        """Scrapes prices from CSP bulk JSON API - uses CSP specific methods"""
        pass

    def bulk_insert_rows(self, rows):
        """Bulk insert batches of rows - Note that Psycopg >= 3.1 uses pipeline mode"""
        query = """INSERT INTO api_price.infra_price VALUES
        (%(sku_id)s, %(cloud_provider)s, %(region)s, %(sku_name)s, %(end_usage_amount)s)"""
        with self.conn.cursor() as cur:
            cur.executemany(query, rows)


This has 3 common methods:

  1. Mapping units to common ones across all CSPs
  2. A top-level scrape SKU method, with CSP-specific differences handled in sub-methods called from it
  3. Bulk insert rows - the main concrete method used by all scrapers
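
As a hedged illustration of how a concrete scraper might subclass it (the real AWS bulk offer files are richer than this, so treat the field mapping as indicative only):

import requests

class AWSScraper(BaseScraper):
    """Illustrative only - the production class handles many more fields"""

    root_url = "https://pricing.us-east-1.amazonaws.com"

    def scrape_sku(self):
        """Fetch the S3 bulk JSON price list and insert it in batches"""
        url = f"{self.root_url}/offers/v1.0/aws/AmazonS3/current/index.json"
        products = requests.get(url).json().get("products", {})
        rows = []
        for sku_id, product in products.items():
            attrs = product.get("attributes", {})
            rows.append({
                "sku_id": sku_id,
                "cloud_provider": "aws",
                "region": attrs.get("location", ""),
                "sku_name": attrs.get("usagetype", ""),
                "end_usage_amount": "",
            })
            if len(rows) >= self.batch:
                self.bulk_insert_rows(rows)
                rows = []
        if rows:
            self.bulk_insert_rows(rows)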

To bulk insert 500 rows per query we use Psycopg 3 pipeline mode, so it can send batch inserts again and again without waiting for a response.

The database update against local embedded Postgres is faster than the time to scrape the remote web site SKUs.
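
The batching itself can be as simple as slicing the rows inside an explicit pipeline block. A sketch - with Psycopg >= 3.1 executemany() already uses pipeline mode internally, so the explicit block just makes the behaviour visible:

def insert_all(conn, rows, batch=500):
    """Send INSERT batches back-to-back without waiting for each reply"""
    query = """INSERT INTO api_price.infra_price VALUES
    (%(sku_id)s, %(cloud_provider)s, %(region)s, %(sku_name)s, %(end_usage_amount)s)"""
    with conn.pipeline():
        with conn.cursor() as cur:
            for i in range(0, len(rows), batch):
                cur.executemany(query, rows[i:i + batch])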


The largest part of the Extract is done at this point. Rather than loading all 5 million SKUs, as we did with the kubecost data dump, and then querying out the 120 thousand for our product, by scraping the sources directly we only need to ingest those 120k SKUs - which saves handling 97.6% of the data!


So the resultant speed is sufficient, although not as performant as pg_dump loading, which uses COPY.


Unfortunately Python Psycopg is significantly slower when using cursor.copy, which militated against using zipped-up Postgres dumps. Hence all the data artefact creation and loading simply uses the pg_dump utility wrapped as a Python shell command.

There is no need to use Python here when there is the tried and tested C-based pg_dump utility for it, which ensures compatibility outside our pipeline. A later version of pg_dump can always handle earlier Postgres dumps.


We don't need to retain a long history of artefacts, since it is public data and never needs to be reverted.

This allows us a low retention level, cleaning out most of the old dumps on creation of a new one, so any storage saving from compression is negligible.

Therefore we avoid pg_dump compression, since it can be significantly slower, especially if the data already contains compressed blobs. Plain SQL COPY also allows for data inspection if required - e.g. grep for a SKU when debugging why a price may be missing.
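
The wrapper need only be a few lines - something like this sketch, where the connection details are illustrative:

import subprocess

def dump_prices(port, database, outfile):
    """Plain-format (uncompressed SQL) dump of the embedded Postgres"""
    subprocess.run(
        ["pg_dump", "--no-owner", "--format=plain",
         "--host=localhost", f"--port={port}", f"--file={outfile}", database],
        check=True,
    )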


Postgres Embedded wrapped with Go

Unlike MySQL, Postgres doesn't do in-memory databases. The equivalent for a temporary or test-run database lifetime is the embedded version of Postgres, run from an auto-created temp folder of files.
Python doesn't have a maintained wrapper for Embedded Postgres; sadly the project https://github.com/Simulmedia/pyembedpg is abandoned 😢

Hence we use the most up to date wrapper, which is written in Go, running the Go binary via a Python shell command.
It still lags behind by a version of Postgres, so it's on Postgres 16 rather than the latest, 17.
But for the purposes of embedded use that is irrelevant.

By using a separate temporary Postgres per step we can save a dumped SQL artefact at the end of a step and need no data dependency between steps, meaning individual step retry in parallel just works.
The performance of a localhost dump to socket is also superior.
By processing everything in the same (if embedded) version of our final target database as the Cloud Price Go micro-service, we remove any SQL compatibility issues and ensure full PostgreSQL functionality is available.

The final data artefacts will be loaded to a Postgres cluster price schema micro-service running on CloudNativePG.
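
A sketch of what running the Go wrapper from Python can look like - pgembed-go stands in for our actual wrapper binary name:

import socket
import subprocess
import tempfile

def setup_pg_db():
    """Run up a throwaway embedded Postgres on a random free port"""
    sock = socket.socket()
    sock.bind(("", 0))                    # ask the OS for a free port
    port = sock.getsockname()[1]
    sock.close()
    folder = tempfile.mkdtemp(prefix="pgembed_")
    subprocess.Popen(["pgembed-go", "--port", str(port), "--dir", folder])
    return folder, port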

Use a Click wrapper with Tests

The click package provides all the functionality for our pipeline's command line:

> pscraper -h

Usage: pscraper [OPTIONS] COMMAND [ARGS]...

  price-scraper: python web scraping of CSP prices for api-price

Options:
  -h, --help  Show this message and exit.

Commands:
  awsscrape    Scrape prices from AWS
  azurescrape  Scrape prices from Azure
  delold       Delete old blob storage files, default all over 12 weeks old are deleted
  gcpscrape    Scrape prices from GCP - set env GCP_BILLING_KEY
  pgdump       Dump postgres file and upload to cloud storage - set env STORAGE_KEY
               > pscraper pgdump --port 5377 --file price.sql
  pgembed      Run up local embedded PG on a random port for tests
               > pscraper pgembed
  pgload       Load schema to local embedded postgres for testing
               > pscraper pgload --port 5377 --file price.sql


This caters for developing the step code entirely outside the pipeline, for development and debug.
We can run pgembed to create a local db, pgload to add the price schema, then run individual scrapes from a pipenv pip install -e version of the price-scraper package.
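
The entry point is just a standard Click group - roughly like this sketch, with the command bodies elided and the option names matching the help text above:

import click

@click.group(context_settings={"help_option_names": ["-h", "--help"]})
def cli():
    """price-scraper: python web scraping of CSP prices for api-price"""

@cli.command()
@click.option("--port", default=5377, help="Port of the local embedded postgres")
@click.option("--file", "file_", default="price.sql", help="Schema / dump file to load")
def pgload(port, file_):
    """Load schema to local embedded postgres for testing"""
    ...  # connect to localhost:port and execute the SQL in file_

if __name__ == "__main__":
    cli(prog_name="pscraper")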


For unit testing we can create a mock response object for the data scrapers that returns different fixture payloads based on the query and monkeypatch it in. This allows us to functionally test the whole scrape and data artefact creation ETL cycle as unit functional tests.

Any issues with source data changes can be replicated via a fixture for regression tests.

import psycopg
import requests
from unittest import TestCase

# fixtures, MockConn and ROOT are defined in the test helpers (not shown)

class MockResponse:
    """Fake to return fixture value of requests.get() for testing scrape parsing"""

    name = "Mock User"
    payload = {}
    content = ""
    status_code = 200
    url = "http://mock_url"

    def __init__(self, payload={}, url="http://mock_url"):
        self.url = url
        self.payload = payload
        self.content = str(payload)

    def json(self):
        return self.payload


def mock_aws_get(url, **kwargs):
    """Return the fixture JSON that matches the URL used"""
    for key, fix in fixtures.items():
        if key in url:
            return MockResponse(payload=fix, url=url)
    return MockResponse()


class TestAWSScrape(TestCase):
    """Tests for the 'pscraper awsscrape' command"""

    @classmethod
    def setUpClass(cls):
        """Simple monkeypatch in mock handlers for all tests in the class"""
        psycopg.connect = MockConn
        requests.get = mock_aws_get
        # confirm that requests is patched hence returns short fixture of JSON from the AWS URLs
        result = requests.get("{}/AmazonS3/current/index.json".format(ROOT))
        assert len(result.json().keys()) > 5 and len(result.content) < 2000

A simple DAG with Soda Data validation

The click commands for each DAG are imported at the top - one for the scrape and one for Postgres embedded - so the DAG just becomes a wrapper to run them, adding Soda data validation of the scraped data ...

def scrape_azure():
    """Scrape Azure via API public json web pages"""
    from price_scraper.commands import azurescrape, pgembed

    # setup_pg_db, csp_dump, notify_slack, PORT and HOST are defined
    # elsewhere in the DAG module
    folder, port = setup_pg_db(PORT)
    error = azurescrape.run_azure_scrape(port, HOST)
    if not error:
        error = csp_dump(port, "azure")
    if error:
        pgembed.teardown_pg_embed(folder)
        notify_slack("azure", error)
        raise AirflowFailException(error)

    data_test = SodaScanOperator(
        dag=dag,
        task_id="data_test",
        data_sources=[
            {
                "data_source_name": "embedpg",
                "soda_config_path": "price-scraper/soda/configuration_azure.yml",
            }
        ],
        soda_cl_path="price-scraper/soda/price_azure_checks.yml",
    )
    data_test.execute(dict())
    pgembed.teardown_pg_embed(folder)
 


We set up a new Embedded Postgres (it takes a few seconds) and then scrape directly into it.

We then use the SodaScanOperator to check the data we have scraped; if there is no error we dump to blob storage, otherwise we notify Slack with the error and raise it, ending the DAG.

Our Soda tests check that the number of SKUs and their prices are in the ranges that they should be for each service. We also check we have the number of tiered rates that we expect: over 10 starting usage rates and over 3000 specific tiered prices.
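
If you were not using Soda, equivalent sanity checks could be run directly against the embedded Postgres with psycopg - a sketch, where the connection details and the meaning of end_usage_amount are assumed for illustration:

import psycopg

def check_scrape(port):
    """Assert the scraped data is roughly the shape we expect"""
    with psycopg.connect(f"host=localhost port={port} dbname=postgres") as conn:
        # assumption: blank end_usage_amount marks a starting usage rate,
        # non-blank marks a specific tiered price
        starting = conn.execute(
            "SELECT count(*) FROM api_price.infra_price WHERE end_usage_amount = ''"
        ).fetchone()[0]
        tiered = conn.execute(
            "SELECT count(*) FROM api_price.infra_price WHERE end_usage_amount != ''"
        ).fetchone()[0]
        assert starting > 10, "expected over 10 starting usage rates"
        assert tiered > 3000, "expected over 3000 specific tiered prices"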

If the Soda tests pass, we dump to cloud storage and tear down the temporary Postgres. A final step aggregates together each step's data. We save the money and maintenance of running a persistent database cluster in the cloud for our pipeline.


Sunday, 9 June 2024

Software Development with Generative AI - 2024 Update


Why write an update?


I wrote a blog post on Software Development with Generative AI last year, questioning the approach of the current AI software authoring assistants. I believe the bigger picture holds true: fully utilizing AI to write software will require an entirely different approach, changing the job of a software developer in a far more radical manner and perhaps making many of today's software languages redundant.

However I also raised the issue that I found the current generative AI helpers' utility questionable for seasoned developers:

"The generative AI can help students and others who are learning to code in a computer language, but can it actually improve productivity for real, full time, developers who are fluent in that language?
I think that question is currently debatable... (but it is improving rapidly) ... We may reach that point within a year or two"

Well it hasn't been a year or two, just 6 months. But I believe the addition of the Chat window to CoPilot and an improvement in the accuracy of its models have already made a significant difference.

On balance I would now say that even a fluent programmer may get some benefits from its use. Given the speed of improvement it is likely that all commercial programming will use an AI assistant within a few years. 


To delay the inevitable and not embed it into your work process is like King Canute commanding the sea to retreat. There are increasing numbers of alternatives available too. However, as it is the market leader, I believe it is worth going into slightly more depth on the current state of play with CoPilot.

Github Copilot Features

The new Chat window within your IDE gives you a context sensitive version of Copilot ChatGPT that can act as a pair programmer and code reviewer for your work. 

If you have enabled auto-complete then you instigate its use by writing functional comments, i.e. prompts, then tabbing out to accept the suggestions it responds with.

To override these prompts you can instead use dot and get real code completion options (as long as your IDE is configured correctly). Since code completion has your whole codebase as context, it complements CoPilot reasonably well. But whilst code completion is always correct, CoPilot is less so - probably more like 75% now, compared to its initial release level of 50%.

It takes some time to improve the quality of your prompting. An effort must be made to eradicate any nuance, assumption, implication or subtlety from your English; precise mechanical instructions are what is required. However its language model will have learnt common usage. So if you ask it to sort out your variables, it will understand that you mean replace all hardcoded values in the body of your code with a set of constants defined at the top, explain that this is what it thinks you mean, and give you the code that does that.

You can ask it anything about the usage of the language you are working in, how something should be coded, alternatives to that, etc. So taking a pair programming approach and explaining what you are about to code, and why, to CoPilot chat as you go can be very useful. Given rubber duck programming is useful, having an intelligent duck that can answer back ... is clearly more so.

It excels as a learning tool, largely replacing Googling and Stack Overflow with an IDE embedded search for learning new languages. But even for a language you know well, there can be details and nuances of usage you have overlooked or changes in syntactic standards with new releases you have missed.

You can also ask it to give your file a code review. Where it will list out a series of suggested refactors that it judges would improve it.

Copilot Limitations

Currently however there are many limitations; understanding them helps you know how to use CoPilot and not turn it off in frustration at its failings!

The most important one is that CoPilot's context is extremely limited. There is no RAG enhancement yet, no learning from your usage. It may seem to improve with usage, but that is just you getting better at using it. It does not learn about you and your coding style as you might expect, given that a dumb shopping site does that as standard.

It does not create a user context for you and populate it with your codebase. It simply grabs the content of the currently edited file, the Chat prompt text and the language version for the session as one big query. The same goes for the auto-suggestion, except there the chat text comes from the comments or doc strings on the preceding lines.

The lot is posted to a fixed CoPilot LLM that is some months out of date, although apparently it has weekly updates from continuous retraining.

This total lack of context can mean the only way you can get CoPilot to suggest what you actually want is to write very detailed prompts. It is often simpler to just cut and paste example code as comments into the file - please rewrite blah like this ... paste example. Since only what is in the file or the latest Chat question will get posted to inform the response.

At the time of writing CoPilot is due to at least retain and learn from Chat window history to extend its context a little. But currently it only knows about the currently open file and latest Chat message. Other providers have tools that do load the whole code base, for example Cody, plus there are open source tools to post more of your code base to ChatGPT or to an open source LLM.

As this blog post update indicates, the whole area is evolving at an extremely rapid pace.

The model it has for a language is fixed and dated. Less so for the core language, but, for example, you may use a newer version of the leading 3rd party Postgres library that came out 2 years ago, whilst the majority of users are still on the previous one, since it is still maintained. Their syntax differs. CoPilot may only know the syntax for the old library, because that is what it was trained with, even though a later version is being imported in the file and so is in CoPilot's limited context. So any chat window or code prompts it suggests will be wrong.

When using the code review feature, I have yet to find it brings up anything useful that I didn't already know about the code, plus the suggestions can include things that are inapplicable or already applied. But I am sure it would be more useful when learning a new language.

AI prompting and commenting issue

Good practice for software teams around code commenting is that you should NOT stick in functional comments that just explain what the next few lines do. The team are developers and they can read the code just as quickly for its base functionality. Adding lots of functional commenting makes things unclear through excessive verbosity.
It is something that is only done when teaching people how to code in example snippets. It has no place in production code.

Comments should be added to give wider context, caveats, assumptions etc. So commenting is all about explaining the Why, not the How.

Doc strings at the head of methods and packages can contain a summary of what the function does in terms of the codebase. So they are more functional in orientation, but as a large-scale summary. So again they are a What, not a How.

It looks like current AI assistants may mess that up, since they need comments that are as close to pseudo code as possible. Adding information about real world issues, roadmap, the wider codebase, integration with other services ... i.e. all the Why, is likely to confuse them and degrade the auto-complete.

Unfortunately code comments are not AI prompts for generating code, and vice versa.
Which suggests that you may want to write a temporary prompt as a comment to generate the code, then replace it with a proper comment once it has served its purpose.

Or otherwise introduce a separate form of hideable, prompt-marked comment that makes it clear what is for the AI and what is for the Human!

Alternatively use the chat window for code generation then paste it in.
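
For example (illustrative only), the throwaway prompt and the comment that should eventually replace it look quite different:

from decimal import Decimal

# TEMPORARY AI PROMPT - delete once the code is generated:
# build a dict of sku_id -> Decimal price, skipping rows whose region
# is not in the allowed set
def prices_by_sku(rows, allowed_regions):
    return {
        row["sku_id"]: Decimal(row["price"])
        for row in rows
        if row["region"] in allowed_regions
    }

# The comment to keep - the Why, not the How:
# regions outside our deployment list are excluded so estimates only
# cover options a customer can actually select.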

Copilot Translation

Translation is an area where CoPilot can be very beneficial. As a non-native English speaker you can interact with it in your own language for prompting and comments, and it will handle that and translate any comments in the file to English if asked to.

Code translation is more problematic, since the whole structure of a program and its common libraries can be different. But if the code is doing some very encapsulated common process - for example just maths operations, or file operations - it can extract the comments and prompts and regenerate the code in another language for you.

One can imagine that one day the only language anyone will need will be a very high level, succinct, English-like language, e.g. Python.
When you want to write in a verbose or low-level language, you just write the simpler prompts in a spoken language, but use Python when it is faster to communicate explicitly than in speech, since spoken languages are so unsuited to creating machine instructions.
Press a button and CoPilot turns the lot into verbose C or Java code with English comments.

Saturday, 27 April 2024

Software Engineering Hiring and Firing



The jump in interest rates to the highest level in over 20 years that hit in summer 2023 in the US, UK and many other countries is still impacting the software industry. Rates may be due to drop soon, but so far the jump has choked off investment, upped borrowing costs and led to many software companies making engineers redundant to please the markets.

For the UK the estimate is around 8% of software industry jobs made redundant. Although strangely, the overall trend in vacancies for software engineers continues to march upwards; the initial surge after the pandemic dipped last summer but has now recovered.
But if you work in the industry you are bound to have colleagues and friends who have been made redundant, if you are lucky enough not to have been impacted personally.

Given recent history, I thought it may be worth reflecting on my personal experience of the whole hiring and firing process, in the tech industry. It is a UK centric view, but the companies I have worked for in the last 8 years are US software companies.

I have been fired, hired and conducted technical interviews to hire others. Giving me a few different perspectives. 

This post is NOT about getting your first Software job

I first got a coding job in the public sector as a self-taught web developer in the 1990s, before web development was something you could get a degree in. So I initially got a job in IT support, volunteered to act up (i.e. no pay increase) and built some websites that were needed, then became a full time web developer through a portfolio of work, i.e. sites.

Today junior developers may have to prove themselves suitable by artificial measures. I skipped these, so I do not have any professional certifications, or any to recommend. I also don't know how to ace coding algorithm or personality profile assessments.

Once you are 5-10 years in to a software career - none of those approaches are used for hiring decisions. 

Only large companies are likely to subject you to them, and that is really out of fairness to the juniors who have to go through them, and to screen out dodgy applicants. Screening just needs to be passed; it will not have any input into whether you get the job. Hence acing the coding interview, as promoted by sites such as LeetCode, is not even a thing - only passing coding exercise systems in order to start, or switch to, a career as a developer. I would recommend starting an open source project instead, to demonstrate you can actually code.

The majority of small to medium software companies and of job vacancies require experience and in effect have no vacancies for the most junior software grades with less than 3 years under their belt. So they tend not to use any of these filtering methods. They just want to see proof that you are already a developer, and usually base that on face to face interviews and examples of your code you provide them. So much like how I was originally hired back in the 1990s.

I have only been subject to a LeetCode style test once, which was for a generic job application, i.e. hiring for numbers of SREs of various seniority, for a FAANG.


 F I R E D 

When you get that unexpected one-to-one Zoom call with your manager appearing in your calendar these days, it is unlikely to be great news 😓

In the majority of cases the firing process, or to be more polite, redundancy, is all about balancing the finances of the whole company or institution. As such it is very unlikely to be about you.

Of course people are also fired as individuals for various reasons: actually not being any good at their job, failing to get along with their manager, being a bad culture fit, jobs turning out not to be what was advertised, or expressing political views. Unlike where I once worked, in the UK Education sector, where 50% of staff are union members, US software companies will have less than 1% membership, so they don't tend to respond well to dissent.

Mostly this happens via failing probation, at around 15%, then maybe another 5% annually for disciplinary / performance improvement failure.

If you want to try getting individually fired then go the overemployed route: get two or three jobs at once and test how long it takes before the company notices that your 110% is now only 40%, and fires you. The rule of thumb is: the larger the company, the longer it takes!

But this post's focus isn't individual firing, it's organizational hiring and firing.

Firing Reasons

  1. A company may be doing badly in a slow long term way, so it has to chop as part of a restructure and downsize to attempt to fix that.
  2. Alternatively the company could be doing really well. So it gets the attention of a big investment company and is bought up and merged with its rival. To fix overlap and justify the merger - both companies lose 20% of staff.
  3. Maybe it needs to pivot towards a new area (currently likely to be AI) and so chop 20% of its staff so it can hire 15% experienced, and pricey, AI developers.
  4. Or it may just have had a one off external impacting event that hit it financially. So to balance the earnings for that year and keep its share price good, it chops a bunch of staff. It will rehire next year, when it suits the balance sheet. This is the example in which I was made redundant along with 5% of staff, it was a big company, so that was a few thousand people globally.
  5. Finally it may be an industry wide phenomenon as it is with the current redundancies in the software industry. A world clamp down on easy cheap loans means investment company driven industries such as tech. are no longer awash with spare cash. Cut backs look good right now, and keep the share price high.
    Hence redundancies that are nothing to do with the industry itself or its future prospects.

That is mirrored in who is fired. Companies do not keep a log book of gold stars and black marks against each employee. They do not use organizationally triggered rounds of redundancies to select individuals to fire. They certainly have not got the capability to accurately determine all the best employees and only fire the worst ones. You will be fired based on what part of the organization you are in, how much it is valued in the current strategy and how much you cost vs others who could do your job. If you are currently between teams, or in a new team or role which has yet to establish itself, when the music stops - like musical chairs - bad luck, you are out.

The only personal element may be that if a whole team is seen as under performing or difficult to manage, it might be axed, no matter that it contains a star performer. Decisions may also be geographic. Let's axe the Greek office; save by withdrawing engineering from a country - which is again how I was made redundant, as the rest of my team was in Greece.
Alternatively it may be: fire under 20 staff from each country, to avoid the more burdensome regulation around bulk layoffs.

The organization could create an insecure / downturn atmosphere to encourage staff to leave, because it is a lot cheaper for people to leave than for the company to pay out redundancy settlements.

Redundancy keeps the average employees 😐

As a result, in response to significant redundancies an organisation will tend to lose more of its best employees - since they are the most able to move, the most likely to get a big pay rise if they move, and the least likely to want to stick around if they see negative organisational change. The software industry has very high staff turnover at almost 20%, outweighing any nominal idea of removing less efficient staff.

If a company handles things well it may only lose a representative productivity range of staff from best to worst. But a bulk redundancy process is likely to lead to the biggest loss in the top talent, get rid of slightly more of the bottom dwellers and so result in maximising the mediocre!

In summary the answer to 'Why me?' in group redundancies is "because you were there" ... and you didn't have a personal friendship with the CEO 😉. Of course that is why new CEOs are often brought in to restructure - the first step of which is to take an axe to the current C-suite. 

Some of the best software engineers I have worked with have been made redundant at some point in their career. Group redundancies are not about you or how well you do your job. But taking it personally and challenging the messenger, with why me?, as demonstrated by recent viral videos, is an understandable emotional response to rejection, and the misguided belief that work aims to be some form of meritocracy, in the same way college might.

LIFO and FIFO

LIFO rather than FIFO is the norm in firing. New hires are less likely to have established themselves as essential to the company, and have fewer personal connections within it. More importantly, in many countries redundancy legislation doesn't kick in until over 2 years of employment, and the longer you have been employed the more the company will have to pay to terminate you.

Which means a new hire who has uprooted for their new tech job will be the most likely to find themselves losing that job when bulk redundancies hit.
But FIFO has its place too: next would be older engineers. Some companies don't even hire hands-on engineers much over the age of 40, anyway. But staff near retirement have at most only a few years left to contribute and may cost more for the same grade. So encouraging early retirement can be part of the bulk redundancy process.

Prejudicial Firing

Whilst redundancy is all about costs and not about your personal performance, that is not to say companies who pass the redundancy choices down to junior managers may not end up firing disproportionate numbers of workers who are not from the same background as their manager, i.e. white USA males, ideally younger than the manager. But prejudice is not personal either; that is pretty much what defines it as prejudice - a pre-judgement of people based on physical characteristics rather than their ability at the job. People are also least likely to fire the staff they have the most in common with, resulting in prejudicial firing.
Unfortunately it seems many companies with a good diversity policy for hiring may not have adequate ones for firing, again resulting in losing more of the higher performing staff.

I have heard of a case where someone got a new manager, who on joining was told to cut from his team, so he fired everyone outside the USA. The worker was so keen to stay at their current employer that they went over the head of their manager to senior management and asked for their redundancy to be repealed. Since they had been at the company many years and personally knew senior management, this worked.
Alternatively a more purely cost based restructure may hire all developers from cheaper countries and fire most of those in the US, as happened with Google's Python team recently.

Fight for your job?

The company may set up a pool process for bulk redundancies if numbers are high enough per country, where you can fight for a place on the lifeboat of remaining positions. 

In both cases I would recommend that you don't waste time on a company that doesn't value you. If you do stay, you risk dealing with the bulk redundancy aftermath, which will be present unless the redundancies were for a pivot (3) or a one-off event (4):
an increased workload, pay freezes, no bonus, needing to over work to justify being kept on, plus a negative work atmosphere.

In a case where I stayed after the department I was in was axed, I had to reapply for a new job which was moved to a different division. The work was less worthwhile and at the time, the employment of in-house software developers as a whole, was questioned as being unnecessary for the organisation. I outstayed my welcome for 18 months of legacy commercial software support, before getting the message and quitting.

Lesson learnt: if you must ask to stay in your company, via senior management, a pool or reapplication, make sure you look around and apply for other jobs outside of it at the same time.

You also miss out on a minimum of a couple of months tax free pay as a settlement.

On the other hand, if the redundancy round is for a more minor pivot, and you are happy in the role, it may be well worth staying around to see how things pan out.

Of course you may get no choice in the matter, in which case get straight into GET HIRED mode and start the job search. If you can manage it fast enough, you will benefit financially from the whole process. Although if the reason is (5), a sector-wide reduction, then it will take longer and be harder to obtain the usual 20% pay increase that a new position can offer.


 H I R E D 

Why change jobs (aside from being fired!)

  • It is a lot easier to get a pay rise or promotion by changing companies than by being promoted internally - whether to fast track your career to a principal or architect top IC role, or just to get a pay rise.
  • Changing jobs gives you much wider experience, of different technology, approaches and cultures. Making you a better engineer.
  • If you have been in your current job over 10 years without significant internal promotions or changes of role then it is detrimental to your CV, indicating you are stuck in a rut and unable to handle change, eg. new technology.
  • You want to shift sectors.
    I changed from public sector web developer, to commercial cloud engineer with one move.
  • You want to get into new technology that is not used in your current role.
    I changed from a Python, Ruby config management automation engineer to a Kubernetes Golang engineer with another.
  • You want to change your role in tech, or leave it entirely. For example get out of sales as a solution architect and back into a more technical role as an SRE.
On that basis many software engineers change jobs every 2 or 3 years for part of their careers. It's expected; the average engineer in a FAANG stays less than 3 years.

Of course you probably need to be in a job for at least 2 years to fully master it.
If your CV has loads of similar positions where you barely make it past the probation period, it marks you out as a failure at those roles == fail hiring at the first step, the HR CV check.

Upskilling

The other problem is that changing jobs to change roles, even if it's just to use a new language or framework, can be blocked by roles requiring experience in that area on the CV to get interviewed in the first place. For software engineering that is less of an issue, since tech changes faster than any other sector.
You just need to prove you have a range of experience and software languages and are willing to learn, early in a technology boom. To catch the cloud engineer bus, I got a job in it in 2016. The US cloud sector was $8 billion back then; it is $600 billion now. Similarly, to get on board with Golang and Kubernetes in 2019. In the first few years of a tech boom most companies will initially have to cross train engineers without direct experience. The corollary of that is that in the current downturn, attempting to pivot to an established technology, which k8s has become, is going to be much harder.

Market rates

Clearly ML ops and AI data science are the current booming areas. The demand so far outstrips supply that switching to a more junior Python AI role may pay as well as a senior Django web developer role, for example.

So around £60k for a junior role, but in 3-4 years it should jump to at least £100k for a senior AI engineer. Of course for US salaries add 30%, plus usually free medical, life insurance etc. The lower tax rates cancel out the higher cost of living in the US ... so it's UK salary +30% in real terms*. Researching the going rate for the particular role, technical skills and sector you are applying for is a necessary part of the hiring process, in order that you don't let recruitment bargain you down too low.

*
Note that geographic software pay differences are why you often come across engineers of other nationalities emigrating to, and working in the higher paying countries, USA, Canada and Australia. I have worked with many people from the UK and Europe who live in the USA, and Indians who live there or Europe, for example.
Of course as a cheap foreign worker myself, I too stick with US companies partly because they pay rather more than UK ones, even if a lot lower than what I would get if I moved there 😉

Now is the time when such a switch will be easier to accomplish without having to work nights doing courses, certifications and personal projects. The usual means of demonstrating your ability without any work experience.

The caveat here is that moving jobs in a down turn, as we are arguably experiencing currently, can depress the market salary rates and if you are already at the top of those when made redundant, can mean you have to take a pay cut for a year or two rather than face the cost of long term unemployment.

The hiring process for an experienced software engineer role
Interview to Offer should take a month.

If not, then the recruitment is likely for a group of roles in an expansion process, and from screening and CV you are not one of the top candidates. You may be waiting on the backlog of potential interviewees for a couple of extra months before it properly kicks off.
Or you are told the post is no longer available, sorry!
Even if you would eventually get a post, the wait may stretch your redundancy settlement. Therefore I would not bother pursuing any application process that looks to be stretching on past 6 weeks.
The start date will be 5 weeks from contract (partly to cater for notice, referee and compliance checks etc.)

That makes it 2 months minimum from applying for a role to starting.


The process will consist of a technical assessment task and at least 3 interviews: screening, manager and technical.
With another for introduction to team mates / office etc., which is unlikely to have any effect on the hiring decision unless you and your potential new manager take an instant personal dislike to each other.

The HR screening interview, just checks you are a genuine candidate for the job.
The Manager interview similarly is more about checking you will fit in with the company and team, plus that you have basic personal communication skills.

The Technical Interview is what matters

Passing the technical interview is what really decides whether you will get a job offer. Sometimes the tech interview may be split into two: one more task and questionnaire based, the other more discussion. Often the initial task part will be given as WFH.

The technical interview will consist of technical questions to explore whether you have the knowledge and experience required, plus something to confirm you can write code and discuss that code, for a developer or SRE role. For the former it would likely be application code, whilst for the latter automation code.
For a more purely system administration / IT support role it will involve specifying your processes for resolving issues.

If you are unlucky and it is an in-person interview, you may have to whiteboard pseudo code live in response to a changing task described to you on the spot, although I have only had that once. More common, especially for hybrid / remote roles, is the take away task, to be completed in a 'few hours' at most.

It is possible that either of the above could be replaced by another source for your code. Talking through one of your open source packages, if you have any. Or talking through one or two longer automated coding exercise assessed tasks. I have never come across either of these though.

The main point is that the core of any technical interview for a developer related role will involve talking through code you have written, as a kicking off point to check your understanding of the code, how it could be improved, how you would tackle scaling, or a new exemplar functional requirement. Its faults and features.

You will be asked to talk through past code or technical work in a more generic manner in response to standard questions along the lines of examples of your past work that show how you fit the job. 

Preparing for the Technical Interview

It doesn't take much to work out that a 20% pay rise is worth a day's work a week.
Assuming you stay in your new job for 2 or 3 years, that is equivalent to 6 months pay.
On that basis even doing a week of work to apply for, and prepare for, a single job is still very well worth it, if you get the job.

Adopting a scatter gun approach, ie applying with a generic CV and covering letter to 10 or more jobs, is a waste of time in my view. If you need a new job, then it should be one you are genuinely interested in and research. That means probably you should only have a maximum of 3 tailored applications on the go at once. Even when I was made redundant (and about to get married) I think I limited myself to 4 job applications in total, with one primary one that thankfully I did end up getting.

There are many sites that can advise how best to do that, based on the framework I have outlined above by which your hiring will be decided. I think preparing some Challenge Action Result stories targeted at the details of the new employer is useful. Plus spending a day or so refining that '2 hours' development task, researching the company and preparing specific questions and perhaps suggestions for your interviewers.

Being a Technical Interviewer

From the other side of the table, clearly candidates need to show sufficient competency for the post. They may show it, but only within a totally different technical stack. Smaller companies tend to have less capacity and time to get people up to speed with new tech, so will likely fail these candidates even though they are capable of doing the job eventually.
The technical interviewers will tend to pair on the assessment, to improve its consistency. Swapping partners for interviews regularly also helps.

The assessment process is likely to use some online system such as Jobvite or Greenhouse, where each interviewer assesses the candidate, finally summarising it all with a recommendation of strong pass, pass or fail. Sometimes that is for a specific post and grade; otherwise the assessment can include a grade recommendation. The manager then rubber stamps it, assuming appropriate funding is available. HR's job is to beat the candidate down to the lowest reasonable price, without going so low that the candidate walks away.

A healthy growing company will tend to have a rolling recruitment process as they expect to be increasing head count in proportion to customers and revenue. On that basis they will likely be aiming to recruit anyone with a good pass, plus maybe most of the passes too.

Given that engineering jobs are highly specialised and require relevant experience, I have not seen cases of way more interviewees than jobs. Currently, even with all the redundancies, there is still an under-supply of engineers.
Also the approach for HR will be to set experience and skills prerequisites for the roles that keep the numbers making it to technical interview down to around double the number of vacancies, since each candidate takes out a day of work for every interviewing engineer, to prep, interview and assess.

HIRING SUMMARY

You must pass each of the first 5 or 6 steps to get to the next one and get the job.

  1. HR check written application is a plausible candidate
  2. FAANG sized company - automated quiz / Leetcode style challenge - to reduce the numbers - because they get way more speculative applicants.
  3. Recruiter chat to check candidate is genuine and available
  4. CV skills / experience check vs other applicants to shortlist those worth interviewing
  5. Technical task - could be takeaway or whiteboard / questionnaire interview
  6. Technical interview, in person or Zoom with engineers

  7. Manager interview / introduction to team mates. 
  8. Recruiter chat. Negotiate exact salary. Agree start date.
  9. Contract is signed, YOU ARE HIRED.