My Journey from Manager to Leader

I started my career by making a terrible mistake. I wanted to be a manager.

Well, that’s wrong. I didn’t want to be a manager. I loved tech, loved getting my hands dirty, and loved getting things done. But I listened to the people around me, mostly my family, who convinced me that dumb people did tech and smart people did management.

I didn’t want to be dumb being a techie. I wanted to look smart when people asked me what I was doing for a living, and looking smart was answering “I’m a tech project manager”.

That was dumb, and I spent 4 years of my life in hell.

I joined a small Web agency as a project manager and I hated every minute of it. As a project manager, my job was to hold people accountable for delivering things on time, and to make sure the clients' requests did not go over budget. It was all about Excel, Gantt charts, and boring meetings.

I hated every single minute of it. I tried to get my hands dirty every time I had the slightest chance. I was bad at managing projects, and most of them ran late and over budget. I was bad at managing clients, and they got everything they wanted for free. I was bad at managing people, and they were unhappy, buried under too much work because I had poorly managed the project they had to deliver. My employers were unhappy with my work, so I felt miserable. The only thing that made me happy was designing specs and UX, but it was only a small part of the job.

As things got unbearable for me, I had to leave the company and start over, making the same mistake and feeling even more miserable. I fell into depression, building myself a prison made of loneliness and alcoholism, until the day I left project management and got back to my 14-year-long love: system engineering.

After one year back in tech, I started to manage again. I spent 20% of my time managing the small team I was assigned, and 80% getting things done. From an infrastructure management point of view, I was overperforming. From a people management one, it was still terrible.

I was micro managing

I have that superhero syndrome and a few insecurities that made me micromanage everything. When I assigned work to people, I always ended up doing it myself because "I will do it faster and better". Unfortunately, this was true most of the time.

The job got done, but people got frustrated and, thankfully, they told me so.

During incidents, I took over from the people handling them to make sure things got back to normal my way. When people had complex things to do, I prevented them from doing them because I considered myself much smarter than them.

I didn't think in terms of team

I was still thinking in terms of getting things done. I gave people work to do, they did it, but we were not a team. I used to say "I" instead of "we" in my reporting, as if I were doing all the work. To be honest, I was better at working alone.

I did not communicate

I had my roadmap but did not consider my people worth sharing it with, so I didn't communicate inside the team. And I assumed people outside of the team were smart enough to understand what we were doing. I was the powerful wizard living in his ivory tower with a bunch of servants doing things for me.

When asked for an infrastructure change I didn't want to do, I just said "no" without giving an explanation, because other people are dumb and don't need to understand my reasons.

I was not empowering people

The only thing I cared about was having things delivered on time, on budget, and working. I didn't think about making people feel good, do stuff they love, and progress. I considered giving them a raise at the end of the year enough.

Fortunately, things changed and I turned from a (poor) manager into something else. I turned into a leader.

Such a change is difficult both to start and to achieve. You need to question yourself first, acknowledge your weaknesses, know where you want to go, and eventually find a path to reach that point.

I have learned to trust people

Micromanagement is often a trust issue. To stop micromanaging, you first need to trust the people you work with. I learned it the hard way.

During my summer vacation, we faced a major crisis. A critical database got corrupted. The binary backups were corrupted too. There was a way to recover it, but it was long, complicated and could fail anytime.

This is exactly the kind of crisis my inner superhero loves handling, since it implies saving the world by working around the clock. Unfortunately, I was far away from the office, with a terrible internet connection, so I had to let someone else save the world.

One of my teammates built a plan to recover the data. It was a great plan, with a rollback at every step, and fallback plans in case something did not go as expected. He estimated the time every step would take to make communication easier. Instead of rushing into saving the world, he took time to test his assumptions first. He wrote scripts everyone could read to understand the technical part. He made a dedicated Slack channel to log everything he was doing. And he made a flowchart of the plan, so he could highlight each step as it was completed for better, clearer communication with the company's management.

When he started to implement the plan, I wanted to interfere. I was the boss and the superhero after all. But there was no room for me, no room for my superhero skills, and no room for my instinct-powered debugging skills. Even communication with upper management was handled.

It was the most frustrating thing I ever experienced. He had built a plan I could have imagined. Actually, that plan was much better than anything I could have done. All I could do was follow the operation from time to time, and applaud.

I was incredibly frustrated at being kept out of the world-saving operation, but when I came back, I knew I could leave anytime, knowing I could trust him to handle any kind of situation.

As frustrating as it was, this experience taught me that I could start trusting people in my team, stop micromanaging and start to focus on something else.

I started to think in terms of team

I switched my mindset from thinking about me to thinking about the team as a whole. It took a few steps to get there, because I had to change myself first. It all started when I read one of those useless motivational quotes people use on the Internet to sound smart and inspired.

There's no I in team, but there's a Y in victory.

The first thing I did was stop considering myself the leader of the team and become a full team member. The only difference was that I kept doing the reporting, administrative and financial tasks, and breaking ties in case of conflicting priorities, because someone has to.

I stopped saying "I" and started saying "WE". WE were doing stuff, WE were saving the world, and WE failed sometimes. Successes became team achievements instead of personal ones. As a consequence, I stopped justifying failures by blaming the person responsible, and took responsibility as the team leader, while keeping the dirty laundry inside the team.

We changed the hiring process too. Instead of interviewing people alone and deciding whether or not they should be hired, I included the other team members as well. Interviews are now done by me plus someone else, and before making an offer, we all go with the candidate for a lunch or a drink.

If anyone in the team really doesn't feel like working with a candidate, it's a no go.

I have learned to communicate

Instead of keeping everything for myself, from roadmap to priorities, and from success to failure, I started to communicate inside and outside of the team.

I communicate about the upcoming projects, sometimes months ahead, so the whole team knows what's going on. They know what technologies they're going to work on so they can start reading about them. They have a clear idea of the infrastructure evolution before it happens, and they can work with a purpose instead of just solving tickets.

I've left my ivory tower. I communicate outside of the team about what's being done. I communicate with users on how we're going to fix their problems and when. I communicate honestly about ongoing incidents even though it makes me uncomfortable when they're caused by a failure on our side. And last February, I gave a presentation of the infrastructure evolution, team achievements and KPIs in front of the 150 people of the company.

I have also learned to communicate about problems before they get out of hand. As a manager, I often talked to people about their problems and weaknesses during the biyearly review. I hate conflict and confrontation so I tried to delay them as much as possible. It was terrible for me, because I stressed out to the max. It was terrible for those people, because they spent a bad hour and did not improve in the meantime. And it was terrible for the team, which started to resent underperformers. Now, I talk to them every time a problem arises so we can fix it and improve together. I have managed to avoid most conflicts and a lot of stress too.

I have learned to empower people

I guess this is where I actually switched from the manager to the leader mindset. I stopped giving my team work and started to give them a direction instead. As I love hiking in the Alps, I had an analogy in mind. There was a summit we had to climb, and my job was to show them the summit and make sure we could all reach it. Until the next summit.

To achieve that journey together, I have built processes around people instead of building them around tasks.

The purpose of the daily standup has changed. I don't use our daily meeting to check whether or not people are doing their job. I use it to know how I can help them achieve what they are doing, unblock them when they are stuck, and find the required resources when needed.

I don't assign jobs anymore (with a few exceptions). People choose what they want to work on instead. It can be because it's something they love, or something they want to learn. It can be operational stuff or R&D. Because people work on things they like most of the time, they are more willing to pick up the less interesting, more boring stuff. I also make sure that people who have done a lot of operations lately get R&D projects to refresh their mind and get out of the platform routine.

We already did systematic code reviews before shipping. I was the one doing the reviews, but I merged my own code. That changed too. Being a team, anyone can review and merge anyone’s code, including, or starting with, mine. It was a critical step. At least two people know what is going on, and we all get to learn and improve from each other.

Finally, I started to hire people more skilled than me. It was hard at the beginning, because "what if they want to take over my role?" This was a stupid way of thinking. Instead of fearing them, I learn from them, ask them their opinion before I start designing a new component, and listen to their suggestions on how to improve the whole team. I have someone who is much better than me at managing a large infrastructure day to day, but less efficient in case of shitstorm. And I have someone who is much better at the lower layers of the system than me, and that's OK. And I let these people take the lead on some topics so they can improve, and I don't have to be everywhere anymore.

Conclusion

There's no real conclusion here. The path to being the Captain Kirk style leader is an infinite road I've only started to walk. Thankfully, I'm surrounded by a great team I can count on to reach the next summit.


My Journey from Manager to Leader was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

Using HDFS as a Backup Storage

I recently started to think about how I could implement a self hosted, scalable, reliable backend infrastructure. Between 15 years of photos, my music, my family's computer backups and many important files, I have about 30TB of data I don't want to lose.

Along with malware, backups are the biggest problem of the digital age. As I manage a large infrastructure at work, they have been giving me nightmares for more than a decade. The more machines and data you get, the less "let's spawn a few servers and run rsync to backup all the stuff" works.

  • You need almost infinite space.
  • You quickly become I/O bound as you run parallel backups on tens of servers.
  • Restoration is extremely slow if you need to restore multiple backups hosted on the same server.
  • It's easy to lose track of what you back up where, unless you start adding CNAMEs like backup.server.xxx.
  • Losing a backup server means you lose all your backups at once.
  • Adding multiple huge backup servers is damn expensive.
Schrödinger's backups: The condition of any backup is unknown until a restore is attempted.

While working on the problem, I first thought about moving my backups to Amazon S3 / Glacier or OVH Public Cloud Object Storage / Archive. Both solutions are interesting because they solve most of my problems:

  • Unlimited space, so I don't have to worry about scaling my servers.
  • Redundancy, so I don't have to fear losing my backups.
  • They run "in the cloud" which means fewer I/O problems (in theory).
  • Restoration is faster (in theory).
  • The price is relatively cheap (about 1000$ / month for 100TB of live data)

Unfortunately, there are also some blocking cons:

  • I didn't want to delegate my backups to a third party, because it implied encrypting EVERYTHING. Encryption takes a lot of CPU, and makes the backups much slower than a simple rsync. And don't tell me about encrypting multi-terabyte databases on the fly. It's insane.
  • You don't control the price. If your backup provider doubles their price, you just have to pay or rethink your whole backup policy, which might be even more expensive.
  • I/Os in Amazon S3 & friends are a joke when you need speed.

I started to have a look at various tools and ended up thinking about using an HDFS cluster as a backup backend.

  • HDFS runs on a cluster, which means you don't have to think about filling this or that server anymore.
  • HDFS scales horizontally.
  • HDFS works great with big big files.
  • HDFS splits the big files in chunks, so storing a 10+TB database is easy.
  • HDFS is object storage, so you can easily run mysqldump | xbstream -c | hdfs — to store large MySQL databases.
  • Because you're running on a bunch of servers at the same time, you solve the I/O problems.
  • HDFS manages replication. No more lost backups because a single server crashes.
  • HDFS is perfect for JBOD. No more RAID which costs money and I/Os.
  • You can use small machines with just a bunch of 4 to 6TB spinning disks and let the magic happen.
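
The streaming idea from the list above can be sketched as a small shell helper. This is only a minimal sketch, not the script I actually run: it assumes the Hadoop client is installed on the machine, it relies on `hdfs dfs -put - <path>` reading the file content from standard input, and the `/backups/<host>/<date>/<name>` layout and both function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical layout: /backups/<host>/<YYYY-MM-DD>/<name>
backup_path() {
  local host="$1" name="$2" day="${3:-$(date +%F)}"
  printf '/backups/%s/%s/%s\n' "$host" "$day" "$name"
}

# Stream standard input into HDFS without staging the data on local disk.
# "hdfs dfs -put - <dest>" reads the file content from stdin.
stream_to_hdfs() {
  local dest="$1"
  hdfs dfs -mkdir -p "$(dirname "$dest")" && hdfs dfs -put - "$dest"
}

# Example (not run here): dump every MySQL database straight into the cluster.
# mysqldump --all-databases | stream_to_hdfs "$(backup_path db01 mysql.sql)"
```

Keeping the path computation separate from the upload makes it easy to find a backup again: the host and the date are enough to rebuild its location.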

Once again there are a few cons:

  • HDFS is not so good at managing a gazillion small files.
  • Unlike ZFS / rsnapshot, HDFS does not handle file deduplication natively (but space is cheap).
  • Complexity: you need a full HDFS cluster with name nodes, journal nodes etc…
  • The HDFS client requires the whole Java stack which you don't want to install everywhere.

Implementation

Friday night, I started to work on a quick and dirty POC to provide an HDFS-backed backup system.

I started to test it on a small HDFS cluster:

  • 2 small 20$/month servers.
  • 4 * 4TB JBOD spinning disks.

For directories full of small files like /etc/, the throughput is about 30% slower than a simple rsync.

For large files, the throughput is 20% faster than rsync because we're limited by the network.

The good point: restoring a file is not about looking for a needle in a haystack anymore. All my prerequisites are satisfied.

The bad point: complexity. Building even a small HDFS cluster is a bit overkill for your home backup. But for professional use, it works like a charm.

Photo: Rob Pongsajapan.

If you found this article helpful please tap or click “♥︎”, follow me on Twitter or subscribe to my Engineering Weekly newsletter.


Using HDFS as a Backup Storage was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

Getting Over a Job Interview that Went Wrong Answering these 4 Simple Questions

A while ago, I started interviewing to become an SRE manager at Google. I got rejected at the end of the screening call with the recruiter because I was not technically skilled enough to go further in the process.

When I hung up the phone, I was devastated. It was the first time someone told me I didn't have the technical skills for a job. And I didn't even talk to an engineer, but to a recruiter. Granted, these were pre-written questions, and he didn't really test my skills, just clicked on some yes / no buttons, but still.

Thankfully, I have built a self recovery process to get over the frustration of being rejected in the professional world. After a good night's sleep, I was feeling better and my mind was clear enough to move on. So I asked myself:

  1. Did I prepare for the interview enough?
  2. Did I really want that job?
  3. Where did I go wrong?
  4. Based on the interview outcome, where can I improve?

1. Did I prepare for the interview enough?

Obviously not.

I had read a bit about Google's hiring process in the past, and I went through 4 interviews at Facebook, but I didn't prepare for that interview at all. As a consequence, getting technical questions during the screening call was a total surprise to me.

Getting prepared is a key part of an interview process. Most companies publish materials about their hiring process, and when they don't, people do it for them. Read about the company itself, its culture, its strategy so you can prepare some questions ahead. And if you're expecting detailed technical questions, work on them ahead of time.

2. Did I really want that job?

That's a tricky question. Working at Google meant being surrounded by smart people who would constantly take me out of my comfort zone. Such challenges are what I'm looking for in my professional life, so maybe I wanted that job. But I didn't really want to do it at Google, which has become too much of a corporation, and no longer has the agility of a startup.

So I guess the answer was no, which is the reason why I didn't actually prepare for the interview. But they contacted me, so I decided to give it a try.

3. Where did I go wrong?

I was already seeing myself working at Google in London. It didn't occur to me that I could fail, so I started the call without being at 200%. When the technical questions started, I completely lost my grip, and I was unable to answer simple questions I actually knew the answers to: "what's an inode?", "What's the system call used to open a socket?", "What are the system calls used to get information about a path or file?". And the most important: "What's the average speed of an unladen swallow?"

Jokes aside, "where did I go wrong" is a tricky question because that's not something the recruiter will tell you and they probably don't have the answer. That's why a good night of sleep is important before you can ask yourself these questions. They require taking some distance from the topic.

4. Where can I improve?

Besides preparing for the interview, and my knowledge about swallows, there were a few things I knew I could improve. I've been using UNIX concepts for more than 20 years, but I never studied them formally and in detail. So I had a "back to school" weekend, reading documentation and the Linux kernel code about syscalls, to enhance my knowledge of the UNIX basics. As a result, I learned an interesting bunch of things, including the fact that I knew much more than I thought.

Setting yourself goals to improve is the real key to getting over a job interview that went wrong. They allow you to look beyond the frustration caused by the rejection and make you stronger for the next time.
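
As an illustration of the kind of basics worth revisiting, here is a minimal sketch around the "what's an inode?" question. It is not taken from the interview: it simply assumes a POSIX shell with `mktemp`, `ln`, `ls -i` and `awk` available, and shows that two hard links to the same file share a single inode number, the value the stat family of system calls reports in `st_ino`.

```shell
#!/bin/bash
# Two hard links to the same file share one inode: the name lives in the
# directory entry, the metadata and block pointers live in the inode.
workdir=$(mktemp -d)
echo "hello" > "$workdir/original"
ln "$workdir/original" "$workdir/hardlink"

# "ls -i" prints the inode number in front of each name.
inode_original=$(ls -i "$workdir/original" | awk '{print $1}')
inode_hardlink=$(ls -i "$workdir/hardlink" | awk '{print $1}')

if [ "$inode_original" = "$inode_hardlink" ]; then
  echo "both names point to inode $inode_original"
fi
rm -rf "$workdir"
```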

Photo: Lloyd Morgan.



Getting Over a Job Interview that Went Wrong Answering these 4 Simple Questions was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

Resizing your Elasticsearch Indexes in Production

Size does matter

One of the burdens of managing thousands of live indexes within the same Elasticsearch cluster is keeping your shards manageable.

When you first design your index, it's hard to predict how big it's going to be in 1, 3, or 9 months. Starting with too many shards puts lots of pressure on your master nodes. It's even counterproductive when you're using routing as it will leave most shards unused.

On the other hand, large shards cause lots of problems too. They're slower to recover, might block the cluster reallocation, and make optimizing impossible. I once ended up with 900GB shards on 1.2TB servers, making my life a nightmare.

There's no silver bullet other than reindexing your indexes entirely, which is not always possible on a production cluster. You have two solutions left:

  • Moving your indexes from one cluster to another.
  • Duplicating your indexes and using the Elasticsearch reindex API with aliases.

Get the sizing right

Experience taught me that 10GB shards offer the best balance between allocation speed, node balancing, and overall cluster management.

With an average of 2GB for 1 million documents, for example, I'll use the following:

  • From 0 to 4 million documents per index: 1 shard.
  • From 4 to 5 million documents per index: 2 shards, so the index can still grow without causing too many problems in the future.
  • With more than 5 million documents: (number of documents / 5 million) + 1 shards.
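
The rule above can be sketched as a small shell function. It is a minimal sketch under the same assumption of roughly 2GB per million documents, which makes 5 million documents weigh about 10GB; the `shards_for` name is hypothetical and the division is an integer division.

```shell
#!/bin/bash
# Number of shards for an index holding $1 documents,
# assuming roughly 2GB per million documents and a 10GB shard target.
shards_for() {
  local documents="$1"
  if [ "$documents" -lt 4000000 ]; then
    echo 1
  elif [ "$documents" -lt 5000000 ]; then
    echo 2
  else
    # Integer division: one shard per full 5 million documents, plus one.
    echo $(( documents / 5000000 + 1 ))
  fi
}

for n in 1000000 4500000 12000000 300000000; do
  echo "$n documents -> $(shards_for "$n") shard(s)"
done
```

With these numbers, a 300 million document index (about 600GB) ends up with 61 shards, each a little under 10GB.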

The more data nodes you have, the better it works when you need to work with thousands of huge indexes (up to 300 million documents) in the same cluster.

Here's a small script I'm using to resize and move things. Indexes are prefixed with a version number and aliases are not.

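#!/bin/bash
# Resize indexes by reindexing each one into a new version with a recomputed shard count.
# Index names are expected to look like <version>_<name>, for example 1_users.
# The indexes to resize are passed as arguments to the script.
cluster="http://cluster:9200"

for index in "$@"; do
  # Extract the document count from the {"count":N,...} response.
  documents=$(curl -s -XGET "${cluster}/${index}/_count" | cut -f 2 -d : | cut -f 1 -d ',')

  if [ "$documents" -lt 4000000 ]; then
    shards=1
  elif [ "$documents" -lt 5000000 ]; then
    shards=2
  else
    shards=$(( documents / 5000000 + 1 ))
  fi

  # Bump the version prefix and keep the rest of the name.
  new_version=$(( $(echo "${index}" | cut -f 1 -d _) + 1 ))
  index_name=$(echo "${index}" | cut -f 2- -d _)
  new_index="${new_version}_${index_name}"

  # Create the destination index with the new shard count.
  curl -XPUT "${cluster}/${new_index}" -H 'Content-Type: application/json' -d '{
    "settings": { "number_of_shards": '"${shards}"' }
  }'
  # Copy the documents from the old index to the new one.
  curl -XPOST "${cluster}/_reindex" -H 'Content-Type: application/json' -d '{
    "source": { "index": "'"${index}"'" },
    "dest": { "index": "'"${new_index}"'" }
  }'
done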

Once you've reindexed, you're ready to move the alias to the right index and delete the old one.
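
That last step can be sketched with the Elasticsearch `_aliases` endpoint, which applies a `remove` and an `add` action in a single atomic call. This is a minimal sketch rather than the script I run in production: the function names and the `users` / `1_users` / `2_users` names are hypothetical, and the old index is only deleted when the alias swap succeeded.

```shell
#!/bin/bash
# Build the body of an _aliases request that moves an alias in one atomic call.
alias_swap_payload() {
  local alias_name="$1" old_index="$2" new_index="$3"
  printf '{"actions":[{"remove":{"index":"%s","alias":"%s"}},{"add":{"index":"%s","alias":"%s"}}]}' \
    "$old_index" "$alias_name" "$new_index" "$alias_name"
}

# Move the alias, then delete the old index only if the swap succeeded.
# curl -f makes HTTP errors return a non-zero exit code.
swap_and_delete() {
  local cluster="$1" alias_name="$2" old_index="$3" new_index="$4"
  curl -sf -XPOST "${cluster}/_aliases" -H 'Content-Type: application/json' \
    -d "$(alias_swap_payload "$alias_name" "$old_index" "$new_index")" \
    && curl -sf -XDELETE "${cluster}/${old_index}"
}

# Example (not run here):
# swap_and_delete http://cluster:9200 users 1_users 2_users
```

Because clients only ever query the alias, they never see a moment where the name points to nothing.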

Photo: Duncan C.



Resizing your Elasticsearch Indexes in Production was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

When do you Draw the Line on Helping People (technically) Online?

Last week, Jess Dodson asked that not so weird question every tech person asks themselves at least once a day.

Where do you draw the line for helping someone (technically) online?

Having published a tech blog and hung out on technical IRC channels for more than 20 years, I often get technical questions by email or on Twitter. These questions range from requests to explain something I wrote in more detail to attempts to get consulting for free.

Facing that kind of situation is not easy. On the one hand, helping people, sharing what you've learned, and helping them progress is nice. Feeling like you have technical legitimacy is great, even more so when you've got that bloody impostor syndrome waiting to pop up in your face anytime. On the other hand, it takes time, your time is worth money, and they ask you to do what you're paid for… for free.

I solve the problem by trying to answer a few quick questions:

Help decision workflow
  • Is the answer obvious and easily googleable?
  • Do I actually know them?
  • Did they help me on comparable issues in the past or are they help whores?
  • Will their problem require a deep study of their specific case or is it a 1 minute thing?
  • Will helping them also help the community, like troubleshooting a bug or documenting an edge case?
  • Is their problem fun to work on?

Following that workflow, the decision process only takes a few seconds and keeps me from wasting my (precious) time helping lazy help whores or people trying to get consulting for free.

Photo: Anthony Albright.



When do you Draw the Line on Helping People (technically) Online? was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

The 13 Most Thought Provoking Questions I was Asked During a Job Interview

1. Why didn’t you get the kind of position we’re offering you before?

That's a tricky question, because the answer might cost you the job.

I already got offered such a position in the past but it was not what I was interested in then.

Lots of people I've met have a hard time switching from a purely tech position into a more managerial one. I used to be one of them until a few years ago, which explains why I switched from a managing position to a full tech one 10 years ago, before I came back to management. I'm still doing 60% tech though.

Everyone has a different career path, so not being interested in this or that role in the past is an honest and acceptable answer.

2. What scares you in that position?

It might seem cocky of me, but nothing scares me when taking a new position. I'm in constant need of new challenges, and applying to a new position means looking for bigger, tougher ordeals.

Of course, some things might worry me, like not being sure the company can pay my salary, turnover getting out of hand, or a crappy coffee machine.

What scares me, however, when starting a new job, is not being able to work with my manager.

I've always worked in small companies, and not being able to get along with your manager makes your daily job terrible. Since you usually won't spend much time with them beforehand, you need to trust your gut on whether or not you want to work with that person. It's a risky bet, so be sure you ask them enough questions to know how they work. Meet other members of the company if you can, inside and outside your future team. They often have great insights into the company's true spirit.

3. Why wouldn’t I hire you?

Well, I don't see any reason not to hire me. I'm a smart, skilful, reliable, handsome Frenchman, so what more would you expect?

You wouldn't hire me because you eventually realize you don't need someone with my profile.

Last year, I met a company that was looking for a CTO. Something rang an alarm bell as they were telling me about the position. There were more than 100 developers and devops to manage, but they were looking for someone with a strong technical skillset, not someone with extensive management experience. The reason they didn't hire me was that they had opened a position they didn't need. Once they understood this, they started to look for an experienced manager with good technical skills, not the other way around.

4. Why would I hire you and not someone with more / less experience?

Asking why they might hire someone with less experience is a way to tell people they're (too) expensive.

You're expensive, so tell me why you're worth the salary you're asking for.

I don't answer with skills, or experience, or why I'm a truly badass guy. I focus my answer on what I can bring to the company as a whole. By doing so, you can shift the focus away from the role you're applying for, and have the recruiter start to think in a more long term and global way.

Asking why they might hire someone with more experience is a way to tell you they doubt you'd fit the role. When asked this specific question, I refer to my past experience and how I can leverage it for the benefit of the whole company. It makes the recruiter think about me as already inside the company, and stop focusing on my lack of experience.

5. What would make you join the company?

Two words: challenge and people.

If you can offer me challenges and people who constantly keep me out of my comfort zone, you have a good chance of getting me. I once left a company because they couldn't give me those challenges anymore.

We spend half of our life at work, so getting bored, even for a lot of money, is a complete no go.

6. What would make you leave the company?

3 things might push me out of a company: dishonesty, lack of professional perspectives, and lack of challenges.

If you promise me something and don't deliver, consider I'm out.

Honesty is critical. You expect me to be honest, I expect you to be honest too. Once the trust link is broken, it's over.

7. Why wouldn’t you take the job if I said yes?

Another tricky question, which requires an honest answer.

"I'd refuse the job if my company made a good financial counter offer" is a terrible answer.

It shows you're more interested in the money than in the company you're applying to, and lets the interviewer think you'll leave as soon as you get a better offer.

Talk about challenges or better career prospects instead. Also, "nothing" is an acceptable answer if it's honest.

8. What are your absolute no goes?

There are a few things I ask of every company during the hiring process: choosing the people I hire, control over the budget, 1–2 days working remotely, shares, and a Mac.

If you consider having me working on a PC running Windows, don't even call me.

9. What would you do first if you were offered that position?

I'd try to define who my clients are. I'd meet with people from the company to understand who they are, how they work, and how I can help them.

Even though I'm working in infrastructure, I need to understand the rest of the company. It gives my job a broader sense of purpose.

10. How would I manage you?

Tell me what you need. Give me some challenges to keep my brain busy for a while. Give me the freedom and resources to achieve them. And judge by the results.

11. What’s your biggest management mistake so far?

I hired someone who was technically good but couldn't deal with managing a live platform. This is something I didn't pay attention to during the hiring process because I was looking for a skilled engineer who could deliver. He ended up burning out and had to leave.

Technical skills are not enough; hiring people able to cope with the workload is an important part of hiring and managing too.

12. Would you fire someone if you had to?

Yes. But not before I try to fix things with that person.

When the team's existence is at stake, firing someone might be the only solution, but it is a last resort.

If there's a behavior problem, I'd try to have them fix it before we go further. I did it once with someone, and once they understood what the problem was, they became great. I used to communicate poorly, and I worked on that a lot to improve myself because it was a real problem 15 years ago.

If there's a motivation / underperformance problem, we set up a one or two month program with daily, then weekly, one-to-ones and see how things go.

If there's a problem within the team, and the company allows it, that person can switch to another team, and we'll see how things turn out.

But if the team itself is at stake because of one member and nothing can improve, I have no problem with firing someone.

13. What do you expect from technical people you hire?

I expect honesty, curiosity and a will to learn.

Honesty is a key asset. When you screw up, admit you've screwed up, learn from your mistakes and move on.

When you're late on something, say so, and don't give unrealistic ETAs. When you can't do something, don't stay stuck in a tunnel; ask for help.

Curiosity and a will to learn are critical too. You don't have to know everything about everything when you join a team, but you need to learn. Something I hate to hear is "XXX sucks, when I was working at YYY, we were using ZZZ", because there's probably a reason why we're using XXX in the first place. Maybe ZZZ is better than XXX but that behavior shows a lack of curiosity and of will to get out of your comfort zone.

Learning new stuff is critical too. We're working in a domain where technology and methodologies evolve quickly, so we need to learn constantly to stay up to date. "We've always done it this way" is the best way to kill a company.

If you found this article helpful please tap or click “♥︎”, follow me on Twitter or subscribe to my Engineering Weekly newsletter.

Photo: Lewis Minor.


The 13 Most Thought Provoking Questions I was Asked During a Job Interview was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

5 Fantastic Command Line Tools (for Mac too) you’ve Probably Never Heard of

I spend 12 to 15 hours a day working in a terminal. As I hate losing time switching between applications, my Web browser and many other things, I've developed a habit of looking for a good command line replacement for most tools I use. This includes task management, listening to music, working with code or managing files.

Most tools I use are well known, but there's also a bunch of them that deserve more fame, so here's a list of my favorites. And good news, they all work on Mac OS!

A cool iTunes replacement

Cmus playing Propellerhead

Cmus is a curses based music player and a great alternative to iTunes when you don't care about watching movies bought on the Store. It plays most modern formats, including MP3 and OGG streams. Cmus is blazing fast, including when it comes to scanning your music library.

Cmus features include a playlist editor, audio scrobbling, and vi bindings including for search.

Mac users rejoice! Cmus is available on Homebrew:

brew install cmus

A CLI for Todoist

Todoist CLI

After looking for the perfect task manager for almost 20 years, I finally fell in love with Todoist. Todoist is a multi platform task management system with an open API, which allowed Takumasa Sakao to build a great CLI in Go.

The Todoist CLI compiles easily on Mac OS but is not available on Homebrew yet.

An old but good file manager

The venerable Midnight Commander

Developed since 1994, Midnight Commander is a clone of the then popular Norton Commander I used to play with on MS-DOS. MC is a comprehensive visual file manager, able to navigate inside file archives and do most of the tasks you'd do with Mac OS X Finder, except maybe the image previews.

Mac users, rejoice! Midnight Commander is available on Homebrew.

brew install mc

Interactive filtering

Peco in action

Peco is an interactive filtering tool built in Go that I started to use with Todoist. I discovered it recently, but it is now one of the things that has improved my productivity the most.

Now, I use Peco to filter things like logs, process stats, find files, as I can type as I think and look through the current results. There are even a bunch of tricks you can use to make Peco even more badass.
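As an illustration of the "find files" case, here is a minimal sketch. pick_file is a hypothetical helper name, and the FILTER variable is an assumption added only so the function can be tried with another command when Peco is not installed.

```shell
#!/bin/sh
# pick_file (hypothetical helper): list the files under a directory and let
# an interactive filter choose among them. Peco reads the candidates on
# stdin and prints the selected line(s) on stdout.
# FILTER is an assumption for dry runs, e.g. FILTER='head -n 1'.
pick_file() {
    find "${1:-.}" -type f | ${FILTER:-peco}
}
```

It could be used as vim "$(pick_file ~/projects)". The same pattern works for logs: pipe the output of tail into peco and type to narrow the lines down.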

Mac users rejoice! Peco is available on Homebrew:

brew install peco

A curses frontend to Git

Tig, a curses frontend for Git

Tig, which you should not confuse with gti, is a comprehensive curses based Git interface. It lets you navigate through branches and commits, view merges and rebases, or compare 2 different branches, which is useful before applying a merge.

Mac users rejoice! Tig is available on Homebrew.

brew install tig

That's all folks. There are many other cool CLI tools I use daily, but they are well-known ones: nmap, lftp, mutt… and I forget most of them. Have fun!



5 Fantastic Command Line Tools (for Mac too) you’ve Probably Never Heard of was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

How we Upgraded a 22TB MySQL Cluster from 5.6 to 5.7 (in 9 months)

Yesterday, the Synthesio Coffee Team finished upgrading a 22TB MySQL cluster from Percona 5.6 to Percona 5.7. We had already upgraded most of our clusters and knew that this one would take time, but we didn’t expect it to take 9 full months. This is what we have learned about migrating giant database clusters without downtime.

The initial setup

Our database cluster is a classic high availability 3 + 1 nodes topology running behind Haproxy. It runs on Debian Jessie without Systemd, with a 4.9.1 kernel (4.4.36 at the beginning). The servers have a 20 core Dual Xeon E5–2660 v3 with 256GB RAM and 36 * 4TB hard drives set up as RAID10. The throughput is around 100 million writes / day, inserts and updates mixed.

Cluster design

The cluster design itself is nothing special:

  • 2 servers are configured as master / master, but writes are performed on the main master only.
  • Reads are performed on the master and both slaves via a Haproxy configured to remove a slave when the replication lags.
  • A spare slave is running offsite with MASTER_DELAY set to 1 hour in case little Bobby Tables plays with our servers.
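How the lag check is wired into Haproxy is not described here. As a rough sketch of the idea, a health check could read the replication status and fail when the lag is too high. lag_ok is a hypothetical helper name and the threshold is whatever you pass in; neither comes from the setup above.

```shell
#!/bin/sh
# lag_ok (hypothetical helper): reads the output of SHOW SLAVE STATUS\G on
# stdin and succeeds only when Seconds_Behind_Master is a number at or
# below the threshold given as first argument (in seconds).
lag_ok() {
    max_lag="$1"
    lag=$(awk -F': ' '/Seconds_Behind_Master/ {print $2}')
    case "$lag" in
        # Empty or non numeric (NULL): replication is stopped or broken.
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$lag" -le "$max_lag" ]
}
```

It could be fed with mysql -e 'SHOW SLAVE STATUS\G' | lag_ok 60, and the exit code reported to Haproxy through whatever check mechanism you use.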

Step 1: in the hell of mysql_upgrade

We upgraded the spare host to MySQL 5.7 using Percona Debian packages.

Upgrade of the spare slave

Upgrading from MySQL 5.6 to MySQL 5.7 requires upgrading every table that has TIME, DATETIME, or TIMESTAMP columns to add support for fractional seconds precision. mysql_upgrade handles that part as well as the system tables upgrade.

Upgrading tables with temporal columns means running an ALTER TABLE … FORCE on every table that requires the upgrade. It meant copying 22TB of data into temporary tables, then loading the data back. On spinning disks.

After 5 months, we were 20% done.

We killed mysql_upgrade and wrote a script to run the ALTER on 2 tables in parallel.
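The script itself is not shown here. A minimal sketch of the idea, assuming a file with one schema.table name per line for the tables that still need the upgrade. alter_statements and run_in_parallel are hypothetical helper names.

```shell
#!/bin/sh
# alter_statements (hypothetical helper): turn "schema.table" names read
# on stdin, one per line, into ALTER TABLE ... FORCE statements.
alter_statements() {
    while IFS=. read -r schema table; do
        # Skip blank or malformed lines that have no table part.
        [ -n "$table" ] || continue
        printf 'ALTER TABLE `%s`.`%s` FORCE;\n' "$schema" "$table"
    done
}

# run_in_parallel (hypothetical helper): run the given command once per
# statement read on stdin, with at most 2 commands running at a time.
# The statement is appended as the last argument of the command.
run_in_parallel() {
    xargs -P 2 -I{} "$@" {}
}
```

With a tables.txt file, it would be run as alter_statements < tables.txt | run_in_parallel mysql -e. Replacing mysql -e with echo gives a dry run that only prints the statements.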

2 months later, the upgrade was 50% done and the replication lag was around 9 million seconds (about 104 days). A massive replication lag was not in the plans, and it introduced a major unexpected delay in the process.

We decided to upgrade our hardware a bit.

We installed a new host with 12 * 3.8TB SSD disks in RAID0 (don’t do this at home), rsynced the data from the spare host and resumed the process. 8 days later, the upgrade was over. It took 3 more weeks to catch up with the replication.

Step 2: adding 2 new MySQL 5.7 slaves

Before doing this, make sure your cluster has GTID enabled. GTID saved us lots of time and headaches as we reconfigured the replication a couple of times. Not sure about friendship, but MASTER_AUTO_POSITION=1 is magic.

We added 2 new slaves running MySQL 5.7. There’s a bug in the Percona postinst.sh script when installing a fresh server with MySQL data not in /var/lib/mysql. The database path is hardcoded in the shell script, which causes the install to hang forever. The bug can be bypassed by installing percona-server-server-5.6 first, then installing percona-server-server-5.7.

Adding 2 new slaves

Once done, we synced the data from the MySQL 5.7 spare host to the new slaves using innobackupex.

On the receiver:

mysql -e "SET GLOBAL expire_logs_days=7;"
nc -l -p 9999 | xbstream -x

On the sender:

innobackupex --stream=xbstream --parallel=8 ./ | nc slave[3,4] 9999

30 hours later:

innobackupex --use-memory=200G --apply-log .

On Slave 3:

CHANGE MASTER TO
MASTER_HOST="master",
MASTER_USER="user",
MASTER_PASSWORD="password",
MASTER_AUTO_POSITION=1;

On slave 4:

CHANGE MASTER TO
MASTER_HOST="slave 3",
MASTER_USER="user",
MASTER_PASSWORD="password",
MASTER_AUTO_POSITION=1;

Step 3: catching up with the replication (again)

Once again, we were late on the replication. I already wrote about fixing a lagging MySQL replication. Please read the pros and cons carefully before you apply the following configuration:

STOP SLAVE;
SET GLOBAL sync_binlog=0;
SET GLOBAL innodb_flush_log_at_trx_commit=2;
SET GLOBAL innodb_flush_log_at_timeout=1800;
SET GLOBAL slave_parallel_workers=40;
START SLAVE;

Catching up with the replication on both hosts took 2 to 3 weeks, mostly because we had many more writes than we usually do.
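The configuration above trades durability for speed, so it is usually reverted once the replica has caught up. A sketch, assuming the previous values were the stock MySQL 5.7 defaults, which may not match your own configuration. restore_durability_sql is a hypothetical helper name.

```shell
#!/bin/sh
# restore_durability_sql (hypothetical helper): print the statements that
# put the flush settings back. The values below are assumed to be the stock
# MySQL 5.7 defaults; use whatever your configuration had before the change.
restore_durability_sql() {
    cat <<'SQL'
SET GLOBAL sync_binlog=1;
SET GLOBAL innodb_flush_log_at_trx_commit=1;
SET GLOBAL innodb_flush_log_at_timeout=1;
SQL
}
```

It could be applied with restore_durability_sql | mysql. slave_parallel_workers is deliberately left untouched in this sketch.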

Step 4: finishing the job

The migration was almost done. There were a few things left to do.

We reconfigured Haproxy to switch the writes to slave 3, which de facto became the new master, and the reads to slave 4. Then, we restarted all the writing processes on the platform to kill the remaining connections.

After 5 minutes, slave 3 had caught up with everything from master, and we stopped the replication. We saved the last transaction ID from master in case we had to roll back.
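One way to record that position, sketched under the assumption that GTID is enabled as in step 2, is to read the gtid_executed variable on master and write it to a file. save_gtid_position is a hypothetical helper name and the file path in the usage note is only an example.

```shell
#!/bin/sh
# save_gtid_position (hypothetical helper): write the set of executed GTIDs
# to a file. The first argument is the output file; the remaining arguments
# are the client command to run, for example: mysql -h master
save_gtid_position() {
    out_file="$1"
    shift
    # -N drops the column header, -B gives plain batch output.
    "$@" -N -B -e 'SELECT @@GLOBAL.gtid_executed' > "$out_file"
}
```

For example: save_gtid_position /root/master-gtid.txt mysql -h master.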

Then, we reconfigured slave 3 replication to make it a slave of slave 4, so we would run in a master / master configuration again.

We upgraded master to MySQL 5.7, ran innobackupex again and made it a slave of slave 3. The replication took a few days to catch up, then, yesterday, we added master back in the cluster.

After one week, we trashed slave 1 and slave 2, which were of no use anymore.

Getting things done

Rollback, did someone say “rollback”?

My old Chinese master had a proverb for every migration:

A good migration goes well but a great migration expects a rollback.

If something went wrong, we had a plan B.

When switching, we kept the last transaction run on master, and master was still connected to slave 1 and slave 2. In case of a problem, we would have stopped everything, exported the missing transactions, loaded the data and switched back to the original master + slave 1 + slave 2. Thankfully, this is not something we had to do. Next time, I’ll tell you how we migrated a 6TB Galera cluster from MySQL 5.5 to MySQL 5.7. But later.


Photo: Astrid Westvang.


How we Upgraded a 22TB MySQL Cluster from 5.6 to 5.7 (in 9 months) was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

So *why* do I use a Mac?

I just read Jason Mark's post So *why* do I use a Mac? and I remembered how people have been asking me *why* I use a Mac for more than a decade.

Working on infrastructure, I spend about 12 hours a day switching between a terminal and a text editor. Add to that more than 20 years as an open source advocate, contributor and user, and people around me expect me to use some obscure Linux or BSD distro without a graphical interface.

These are stereotypes I used to comply with. I spent 10 years running Linux or FreeBSD on my workstation and my laptop, from 1996 to 2006. Back then I was making fun of people having a Mac for running a non-free operating system with almost no applications. Until I bought my first MacBook Pro.

I switched to Mac OS for one reason, I stayed for many. My primary reason to leave Linux / BSD on the Desktop was a text editor. Back in the day, Textmate was the best text editor ever, and the only one to have proper support for Ruby (on Rails). Most modern editors like SublimeText inherit from Textmate.

I kept using a Mac (and Textmate) despite great editors being available on Linux for another reason. For a decade, I spent countless hours rebuilding my kernel, compiling KDE nightlies on FreeBSD on a P3 800 Sony Vaio, and tuning all my configurations. With Linux on the Desktop, I spent a decade doing things on my computer, not doing things with my computer. I know Linux distributions have improved a lot in 10 years, but I don't feel like turning back. Today, I leave most of the computer stuff to Mac OS and the App Store and build cool things using a computer. Which I believe is what computers are meant for.

With Linux on the Desktop, I spent a decade doing things ON my computer, not building things WITH my computer.



So *why* do I use a Mac? was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.

Ops People, do you Know who YOUR Clients are?

Every time I go to a job interview, I ask the same question: "who are my clients going to be?". Having spent the past 15 years in B2B companies, I get big brand names meant to impress, but that's not what I'm asking for. Maybe I should rephrase that, as what I want to know is "As an infrastructure leader, who am I going to provide a service to? Who are the people I can talk with to know if they are satisfied with the infrastructure? Whose needs do I fulfil?" These people are my clients, big names are the company's clients, and that makes a difference.

As an infrastructure leader, who am I going to provide a service to? Whose needs do I fulfil?

When I joined Synthesio in December 2014, I spent most of my first week talking to people from various departments. I wanted to know who they are, how they work and how I could help them. I didn't spend my whole time discovering the infrastructure, I wanted to understand what people were expecting from me. This time spent meeting people helped me to shape my forthcoming infrastructure strategy.

There were lots of expectations.

The R&D team needed up to date, easy to use local development environments, continuous integration, centralized logging and the ability to get an infrastructure for a new feature in a few days if not hours.

Product people needed usage metrics I could provide them, a staging environment we call Theory because "it works in theory" and the ability to get a new feature live fast, including large clusters.

Support wanted easy to read business oriented metrics to know if a problem was local to a client, due to a bug or a general infrastructure issue. In other words, not being in the dark anymore when they face an angry client.

Project managers and sales wanted a stable, reliable, fast infrastructure that won't let them down when they work on the product or during a demo. They wanted better communication about problems and ETAs for incident resolution.

Talking with people from various departments helped me to understand what they needed and build my infrastructure strategy.

After a few days, I knew who my clients were and what they expected from me, and I was able to build an infrastructure strategy: automation, redundancy, reliability and metrics. All I had to do was set priorities, build a plan and deliver.

Building an infrastructure strategy

Two and a half years later, I still talk with people from the other teams a lot to understand if their needs have changed and how I can help. These are informal talks during coffee breaks that allow me to adjust my global strategy and the team priorities.

As an infrastructure leader, I'm client obsessed. I don't consider the infrastructure as an end, but as a way to provide other people a service. It means getting out of my sysadmin ivory tower and understanding the world around me. It also means moving fast and (sometimes) breaking things. In my career, I've seen too many IT people refuse to let their infrastructure evolve and treat developers and users as a nuisance because they could break what took so much time to build. This is not how you deliver a service.


Photo: Didriks.


Ops People, do you Know who YOUR Clients are? was originally published in Fred Thoughts on Medium, where people are continuing the conversation by highlighting and responding to this story.