USENIX Release Engineering Summit 2015 recap

>> Tuesday, November 24, 2015

November 13th, I attended the USENIX Release Engineering Summit in Washington, DC.  This summit was along side the larger LISA conference at the same venue. Thanks to Dinah McNutt, Gareth Bowles, Chris Cooper,  Dan Tehranian and John O'Duinn for organizing.



I gave two talks at the summit.  One was a long talk on how we have scaled our Android testing infrastructure on AWS, as well as a look back at how it evolved over the years.

Picture by Tim Norris - Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0)
https://www.flickr.com/photos/tim_norris/2600844073/sizes/o/

Scaling mobile testing on AWS: Emulators all the way down from Kim Moir

I gave a second lightning talk in the afternoon on the problems we face with our large distributed continuous integration, build and release pipeline, and how we are working to address the issues. The theme of this talk was that managing a large distributed system is like being the caretaker for the water, or some days, the sewer system for a city.  We are constantly looking system leaks and implementing system monitoring. And probably will have to replace it with something new while keeping the existing one running.

Picture by Korona Lacasse - Creative Commons 2.0 Attribution 2.0 Generic https://www.flickr.com/photos/korona4reel/14107877324/sizes/l



In preparation for this talk, I did a lot of reading on complex systems design and designing for recovery from failure in distributed systems.  In particular, I read Donatella Meadows' book Thinking in Systems. (Cate Huston reviewed the book here). I also watched several talks by people who talked about the challenges they face managing their distributed systems including the following:
I'd also like to thank all the members of Mozilla releng/ateam who reviewed my slides and provided feedback before I gave the presentations.
The attendees of the summit attended the same keynote as the LISA attendees.  Jez Humble, well known for his Continuous Delivery and Lean Enterprise books provided a keynote on Lean Configuration Management which I really enjoyed. (Older version of slides from another conference, are available here and here.)



In particular, I enjoyed his discussion of the cultural aspects of devops. I especially like that he stated that "You should not have to have planned downtime or people working outside business hours to release".  He also talked a bit about how many of the leaders that are looked up to as visionaries in the tech industry are known for not treating people very well and this is not a good example to set for others who believe this to be the key to their success.  For instance, he said something like "what more could Steve Jobs have accomplished had he treated his employees less harshly".

Another concept he discussed which I found interesting was that of the strangler application. When moving from a large monolithic application, the goal is to split out the existing functionality into services until the originally application is left with nothing.  Exactly what Mozilla releng is doing as we migrate from Buildbot to taskcluster.


http://www.slideshare.net/jezhumble/architecting-for-continuous-delivery-54192503


At the release engineering summit itself,   Lukas Blakk from Pinterest gave a fantastic talk Stop Releasing off Your Laptop—Implementing a Mobile App Release Management Process from Scratch in a Startup or Small Company.  This included grumpy cat picture to depict how Lukas thought the rest of the company felt when that a more structured release process was implemented.


Lukas also included a timeline of the tasks that implemented in her first six months working at Pinterest. Very impressive to see the transition!


Another talk I enjoyed was Chaos Patterns - Architecting for Failure in Distributed Systems by Jos Boumans of Krux. (Similar slides from an earlier conference here). He talked about some high profile distributed systems that failed and how chaos engineering can help illuminate these issues before they hit you in production.


For instance, it is impossible for Netflix to model their entire system outside of production given that they consume around one third of nightly downstream bandwidth consumption in the US. 

Evan Willey and Dave Liebreich from Pivotal Cloud Foundry gave a talk entitled "Pivotal Cloud Foundry Release Engineering: Moving Integration Upstream Where It Belongs". I found this talk interesting because they talked about how the built Concourse, a CI system that is more scaleable and natively builds pipelines.   Travis and Jenkins are good for small projects but they simply don't scale for large numbers of commits, platforms to test or complicated pipelines. We followed a similar path that led us to develop Taskcluster

There were many more great talks, hopefully more slides will be up soon!

Read more...

The mystery of high pending counts

>> Friday, September 25, 2015

In September, Mozilla release engineering started experiencing high pending counts on our test pools, notably Windows, but also Linux (and consequently Android).  High pending counts mean that there are thousands of jobs queued to run on the machines that are busy running other jobs.  The time developers have to wait for their test results is longer than ideal.


Usually, pending counts clear overnight as less code is pushed during the night (in North America) which invokes fewer builds and tests.  However, as you can see from the graph above, the Windows test pending counts were flat last night. They did not clear up overnight. You will also note that try, which usually comprises 63% of our load, has very highest pending counts compared to other branches.  This is because many people land on try before pushing to other branches, and tests aren't coalesced on try.


The work to determine the cause of high pending counts is always an interesting mystery.
  • Are end to end times for tests increasing?
  • Have more tests been enabled recently?
  • Are retries increasing? (Tests the run multiple times because the initial runs fail due to infrastructure issues)
  • Are jobs that are coalesced being backfilled and consuming capacity?
  • Are tests being chunked into smaller jobs that increase end to end time due to the added start up time?
Mystery by ©Stuart Richards, Creative Commons by-nc-sa 2.0

Joel Maher and I looked at the data for this last week and discovered what we believe to be the source of the problem.  We have determined that since the end of August a number of new test jobs were enabled that increased the compute time per push on Windows by 13% or 2.5 hours per push.  Most of these new test jobs are for e10s
Increase in seconds that new jobs added to the total compute time per push.  (Some existing jobs also reduced their compute time for a total difference about about 2.5 more hours per push on Windows)
The e10s initiative is an important initiative for Mozilla to make Firefox performance and security even better.  However, since new e10s and old tests will continue to run in parallel, we need to get creative on how to have acceptable wait times given the limitations of our current Windows tests pools.  (All of our Windows test run on bare metal in our datacentre, not on Amazon).
 
Release engineering is working to reduce this pending counts given our current hardware constraints with the following initiatives: 

To reduce Linux pending counts:
  • Added 200 new instances to the tst-emulator64 pool (run Android test jobs on Linux emulators) (bug 1204756)
  • In process of adding more Linux32 and Linux64 buildbot masters (bug 1205409) which will allow us to expand our capacity more

Ongoing work to reduce the Windows pending counts:

  • Disable Linux32 Talos tests and redeploy these machines as Windows test machines (bug 1204920 and bug 1208449)
  • Reduce the number of talos jobs by running SETA on talos (bug 1192994)
  • Developer productivity team is investigating whether non-operating specific tests that run on multiple windows test platforms can run on fewer platforms.

How can you help? 

Please be considerate when invoking try pushes and only select the platforms that you explicitly require to test.  Each try push for all platforms and all tests invokes over 800 jobs.

Read more...

Learning Data Science and evidence based teaching methods

>> Friday, July 17, 2015

This spring, I took several online courses on the topic of data science.  I became interested in expanding my skills in this area because as release engineers, we deal with a lot of data.  I wanted to learn new tools to extract useful information from the distributed systems behemoth we manage.


This xckd reminded me of the challenges of managing our buildfarm somedays :-)

From http://xkcd.com/1546/

I took three courses from Coursera's Data Science track from John Hopkins University. As with previous coursera classes I took, all the course material is online (lecture videos and notes).  There are quizzes and assignments that are due each week.  Each course below was about four weeks long.

The Data Scientist's Toolbox - This course was pretty easy. Basically a introduction to the questions that data scientists deal as well a primer on installing R, RStudio (IDE for R), and using GitHub.
R Programming - Introduction to R.  Most of the quizzes and examples used publicly available data for the programming exercises.  I found I had to do a lot of reading in the R API docs or on stackoverflow to finish the assignments.  The lectures didn't provide a lot of the material needed to complete the assignments.  Lots of techniques to learn how to subset data using R which I found quite interesting, reminded me a lot of querying databases with SQL to conduct analysis.
Getting and Cleaning Data - More advanced techniques using R.  Using publicly available data sources to clean different data sources in different formats, XML, excel spreadsheets, comma or tab delimited. Given this data, we had to answer many questions and conduct specific analysis by writing  R programs.  The assignments were pretty challenging and took a long time. Again, the course material didn't really cover all the material you needed to do the assignments so a lot of additional reading was required.

There are six more courses in the Data Science track that I'll start tackling again in the fall that cover subjects such as reproducible research, statistical inference and machine learning.   My next coursera  class is Introduction to Systems Engineering which I'll start in a couple of weeks.  I've really become interested in learning more about this subject after reading Thinking in Systems.

The other course I took this spring was the Software Carpentry Instructor training course.   The Software Carpentry Foundation teachers researchers basic software skills.  For instance, if you are a biologist analyzing large data sets it would be useful to learn how to use R, Python, and version control to store the code you wrote to share with others.  These are not skills that many scientists acquire in their formal university training, and learning them allows them to work more productively.  The instructor course was excellent, thanks Greg Wilson for your work teaching us.

We read two books for this course:
Building a Better Teacher: An interesting overview of how teacher is taught in different countries and how to make it more effective. Most important: Have more opportunities for other teachers to observe your classroom and provide feedback which I found analogous to how code review makes us better software developers.
How Learning Works: Seven Research-Based Principles for Smart Teaching: A book summarizing the research in disciplines such as education, cognitive science and psychology on the effective techniques for teaching students new material.  How assessing student's prior knowledge can help you better design your lessons, how to to ask questions to determine what material students are failing to grasp, how to understand student's motivation for learning and more.  Really interesting research.

For the instructor course, we met every couple of weeks online where Greg would conduct a short discussion on some of the topics on a conference call and we would discuss via etherpad interactively. We would then meet in smaller groups later in the week to conduct practice teaching exercises.  We also submitted example lessons to the course repo on GitHub. The final project for the course was to conduct a short lesson to a group of instructors that gave feedback, and submit a pull request to update an existing lesson with a fix.  Then we are ready to sign up to teach a Software Carpentry course!

In conclusion, data science is a great skill to have if you are managing large distributed systems.  Also, using evidence based teaching methods to help others learn is the way to go!

Other fun data science examples include
Tracking down the Villains: Outlier Detection at Netflix - detecting rogue servers with machine learning
Finding Shoe Stores in 100k Merchants: Using Data to Group All Things - finding out what Shopify merchants sell shoes using Apache Spark and more
Looking Through Camera Lenses: The Application of Computer Vision at Etsy

Read more...

Test job reduction by the numbers

>> Tuesday, June 16, 2015

In an earlier post,  I wrote how we had reduced the amount of test jobs that run on two branches to allow us to scale our infrastructure more effectively.  We run the tests that historically identify regressions more often.  The ones that don't, we skip on every Nth push.  We now have data on how this reduced the number of jobs we run since we began implementation in April.

We run SETA on two branches (mozilla-inbound and fx-team) and on 18 types of builds.  Collectively, these two branches represent about 20% of pushes each month.  Implementing SETA allowed us to move  from ~400 -> ~240 jobs per push on these two branches1 We run the tests identified as not reporting regressions on every 10th commit or 90 minutes since the last test was scheduled.  We run the critical tests on every commit.2

Reduction in number of jobs per push on mozilla-inbound as SETA scheduling is rolled out

A graph for the fx-team branch shows a similar trend. It was a staged rollout starting in early April, as I enabled platforms and as the SETA data became available. The dip in early June reflects where I enabled SETA for Android 4.3.

This data will continue to be updated in our scheduling configuration as it evolves and is updated by the code that Joel and Vaibhav wrote to analyze regressions. The analysis identifies that there were

Jobs to ignore: 440
Jobs to run: 114
Total number of jobs: 554

which is significant.  Our buildbot configurations are updated the latest SETA data with every reconfig, which occurs usually occurs every couple of days.

The platforms configured to run fewer tests for both opt and debug are

        MacOSX (10.6, 10.10)
        Windows (XP, 7, 8)
        Ubuntu 12.04 for linux32, linux64 and ASAN x64
        Android 2.3 armv7 API 9
        Android 4.3 armv7 API 11+

Additional info
1Tests may have been disabled/added at the same time,  this is not taken into account
2There still some scheduling issues to be fixed see bug 1174870  and bug 1174746 for further details

Read more...

Mozilla pushes - May 2015

>> Friday, June 12, 2015

Here's May 2015's monthly analysis of the pushes to our Mozilla development trees. You can load the data as an HTML page or as a json file.


Trends

The number of pushes decreased from those recorded in the previous month (8894) with a total of 8363. 

 
Highlights

  • 8363 pushes
  • 270 pushes/day (average)
  • Highest number of pushes/day: 445 pushes on May 21, 2015
  • 16.03 pushes/hour (highest average)

General Remarks
  • Try has around 62% of all the pushes now
  • The three integration repositories (fx-team, mozilla-inbound and b2g-inbound) account around 27% of all the pushes.

Records
  • August 2014 was the month with most pushes (13090  pushes)
  • August 2014 had the highest pushes/day average with 422 pushes/day
  • July 2014 had the highest average of "pushes-per-hour" with 23.51 pushes/hour
  • October 8, 2014 had the highest number of pushes in one day with 715 pushes 



Read more...

Mozilla pushes - April 2015

>> Friday, May 01, 2015

Here's April 2015's  monthly analysis of the pushes to our Mozilla development trees. You can load the data as an HTML page or as a json file.  


Trends
The number of pushes decreased from those recorded in the previous month with a total of 8894.  This is due to the fact that gaia-try is managed by taskcluster and thus these jobs don't appear in the buildbot scheduling databases anymore which this report tracks.


Highlights

  • 8894 pushes
  • 296 pushes/day (average)
  • Highest number of pushes/day: 528 pushes on Apr 1, 2015
  • 17.87 pushes/hour (highest average)

General Remarks

  • Try has around 58% of all the pushes now that we no longer track gaia-try
  • The three integration repositories (fx-team, mozilla-inbound and b2g-inbound) account around 28% of all the pushes.

Records

  • August 2014 was the month with most pushes (13090  pushes)
  • August 2014 had the highest pushes/day average with 422 pushes/day
  • July 2014 had the highest average of "pushes-per-hour" with 23.51 pushes/hour
  • October 8, 2014 had the highest number of pushes in one day with 715 pushes  



Note
I've changed the graphs to only track 2015 data.  Last month they were tracking 2014 data as well but it looked crowded so I updated them.  Here's a graph showing the number of pushes over the last few years for comparison.



Read more...

Releng 2015 program now available

>> Tuesday, April 28, 2015

Releng 2015 will take place in concert with ICSE in Florence, Italy on May 19, 2015. The program is now available. Register here!

via romana in firenze by ©pinomoscato, Creative Commons by-nc-sa 2.0



Read more...

Less testing, same great Firefox taste!


Running a large continuous integration farm forces you to deal with many dynamic inputs coupled with capacity constraints. The number of pushes increase.  People add more tests.  We build and test on a new platform.  If the number of machines available remains static, the computing time associated with a single push will increase.  You can scale this for platforms that you build and test in the cloud (for us - Linux and Android on emulators), but this costs more money.  Adding hardware for other platforms such as Mac and Windows in data centres is also costly and time consuming.

Do we really need to run every test on every commit? If not, which tests should be run?  How often do they need to be run in order to catch regressions in a timely manner (i.e. able to bisect where the regression occurred)


Several months ago, jmaher and vaibhav1994, wrote code to analyze the test data and determine the minimum number of tests required to run to identify regressions.  They named their software SETA (search for extraneous test automation). They used historical data to determine the minimum set of tests that needed to be run to catch historical regressions.  Previously, we coalesced tests on a number of platforms to mitigate too many jobs being queued for too few machines.  However, this was not the best way to proceed because it reduced the number of times we ran all tests, not just less useful ones.  SETA allows us to run a subset of tests on every commit that historically have caught regressions.  We still run all the test suites, but at a specified interval. 

SETI – The Search for Extraterrestrial Intelligence by ©encouragement, Creative Commons by-nc-sa 2.0
In the last few weeks, I've implemented SETA scheduling in our our buildbot configs to use the data that the analysis that Vaibhav and Joel  implemented.  Currently, it's implemented on mozilla-inbound and fx-team branches which in aggregate represent around 19.6% (March 2015 data) of total pushes to the trees.  The platforms configured to run fewer tests for both opt and debug are
  • MacOSX (10.6, 10.10)
  • Windows (XP, 7, 8)
  • Ubuntu 12.04 for linux32, linux64 and ASAN x64
  • Android 2.3 armv7 API 9

As we gather more SETA data for newer platforms, such as Android 4.3, we can implement SETA scheduling for it as well and reduce our test load.  We continue to run the full suite of tests on all platforms other branches other than m-i and fx-team, such as mozilla-central, try, and the beta and release branches. If we did miss a regression by reducing the tests, it would appear on other branches mozilla-central. We will continue to update our configs to incorporate SETA data as it changes.

How does SETA scheduling work?
We specify the tests that we would like to run on a reduced schedule in our buildbot configs.  For instance, this specifies that we would like to run these debug tests on every 10th commit or if we reach a timeout of 5400 seconds between tests.

http://hg.mozilla.org/build/buildbot-configs/file/2d9e77a87dfa/mozilla-tests/config_seta.py#l692


Previously, catlee had implemented a scheduling in buildbot that allowed us to coallesce jobs on a certain branch and platform using EveryNthScheduler.  However, as it was originally implemented, it didn't allow us to specify tests to skip, such as mochitest-3 debug on MacOSX 10.10 on mozilla-inbound.  It would only allow us to skip all the debug or opt tests for a certain platform and branch.

I modified misc.py to parse the configs and create a dictionary for each test specifying the interval at which the test should be skipped and the timeout interval.  If the tests has these parameters specified, it should be scheduled using the  EveryNthScheduler instead of the default scheduler.

http://hg.mozilla.org/build/buildbotcustom/file/728dc76b5ad0/misc.py#l2727
There are still some quirks to work out but I think it is working out well so far. I'll have some graphs in a future post on how this reduced our test load. 

Further reading
Joel Maher: SETA – Search for Extraneous Test Automation



Read more...

Mozilla pushes - March 2015

>> Wednesday, April 15, 2015

Here's March 2015's  monthly analysis of the pushes to our Mozilla development trees. You can load the data as an HTML page or as a json file.

Trends
The number of pushes increased from those recorded in the previous month with a total of 10943. 

Highlights

  • 10943 pushes
  • 353 pushes/day (average)
  • Highest number of pushes/day: 579 pushes on Mar 11, 2015
  • 23.18 pushes/hour (highest average)

General Remarks
  • Try keeps on having around 49% of all the pushes
  • The three integration repositories (fx-team, mozilla-inbound and b2g-inbound) account around 26% of all the pushes.

Records
  • August 2014 was the month with most pushes (13090  pushes)
  • August 2014 had the highest pushes/day average with 422 pushes/day
  • July 2014 had the highest average of "pushes-per-hour" with 23.51 pushes/hour
  • October 8, 2014 had the highest number of pushes in one day with 715 pushes 





Read more...

Scaling Yosemite

>> Friday, March 20, 2015

We migrated most of our Mac OS X 10.8 (Mountain Lion) test machines to 10.10.2 (Yosemite) this quarter.

This project had two major constraints:
1) Use the existing hardware pool (~100 r5 mac minis)
2) Keep wait times sane1.  (The machines are constantly running tests most of the day due to the distributed nature of the Mozilla community and this had to continue during the migration.)

So basically upgrade all the machines without letting people notice what you're doing!

Yosemite Valley - Tunnel View Sunrise by ©jeffkrause, Creative Commons by-nc-sa 2.0

Why didn't we just buy more minis and add them to the existing pool of test machines?
  1. We run performance tests and thus need to have all the machines running the same hardware within a pool so performance comparisons are valid.  If we buy new hardware, we need to replace the entire pool at once.  Machines with different hardware specifications = useless performance test comparisons.
  2. We tried to purchase some used machines with the same hardware specs as our existing machines.  However, we couldn't find a source for them.  As Apple stops production of old mini hardware each time they announce a new one, they are difficult and expensive to source.
Apple Pi by ©apionid, Creative Commons by-nc-sa 2.0

Given that Yosemite was released last October, why we are only upgrading our test pool now?  We wait until the population of users running a new platform2 surpass those the old one before switching.

Mountain Lion -> Yosemite is an easy upgrade on your laptop.  It's not as simple when you're updating production machines that run tests at scale.

The first step was to pull a few machines out of production and verify the Puppet configuration was working.  In Puppet, you can specify commands to only run certain operating system versions. So we implemented several commands to accommodate changes for Yosemite. For instance, changing the default scrollbar behaviour, new services that interfere with test runs needed to be disabled, debug tests required new Apple security permissions configured etc.

Once the Puppet configuration was stable, I updated our configs so the people could run tests on Try and allocated a few machines to this pool. We opened bugs for tests that failed on Yosemite but passed on other platforms.  This was a very iterative process.  Run tests on try.  Look at failures, file bugs, fix test manifests. Once we had to the opt (functional) tests in a green state on try, we could start the migration.

Migration strategy
  • Disable selected Mountain Lion machines from the production pool
  • Reimage as Yosemite, update DNS and let them puppetize
  • Land patches to disable Mountain Lion tests and enable corresponding Yosemite tests on selected branches
  • Enable Yosemite machines to take production jobs
  • Reconfig so the buildbot master enable new Yosemite builders and schedule jobs appropriately
  • Repeat this process in batches
    • Enable Yosemite opt and performance tests on trunk (gecko >= 39) (50 machines)
    • Enable Yosemite debug (25 more machines)
    • Enable Yosemite on mozilla-aurora (15 more machines)
We currently have 14 machines left on Mountain Lion for mozilla-beta and mozilla-release branches.

As a I mentioned earlier, the two constraints with this project were to use the existing hardware pool that constantly runs tests in production and keep the existing wait times sane.  We encountered two major problems that impeded that goal:

It's a compliment when people say things like "I didn't realize that you updated a platform" because it means the upgrade did not cause large scale fires for all to see.  So it was a nice to hear that from one of my colleagues this week.

Thanks to philor, RyanVM and jmaher for opening bugs with respect to failing tests and greening them up.  Thanks to coop for many code reviews. Thanks dividehex for reimaging all the machines in batches and to arr for her valiant attempts to source new-to-us minis!

References
1Wait times represent the time from when a job is added to the scheduler database until it actually starts running. We usually try to keep this to under 15 minutes but this really varies on how many machines we have in the pool.
2We run tests for our products on a matrix of operating systems and operating system versions. The terminology for operating system x version in many release engineering shops is a platform.  To add to this, the list of platform we support varies across branches.  For instance, if we're going to deprecate a platform, we'll let this change ride the trains to release.

Further reading
Bug 1121175: [Tracking] Fix failing tests on Mac OSX 10.10 
Bug 1121199: Green up 10.10 tests currently failing on try 
Bug 1126493: rollout 10.10 tests in a way that doesn't impact wait times
Bug 1144206: investigate what is causing frequent talos failures on 10.10
Bug 1125998: Debug tests initially took 1.5-2x longer to complete on Yosemite


Why don't you just run these tests in the cloud?
  1. The Apple EULA severely restricts virtualization on Mac hardware. 
  2. I don't know of any major cloud vendors that offer the Mac as a platform.  Those that claim they do are actually renting racks of Macs on a dedicated per host basis.  This does not have the inherent scaling and associated cost saving of cloud computing.  In addition, the APIs to manage the machines at scale aren't there.
  3. We manage ~350 Mac minis.  We have more experience scaling Apple hardware than many vendors. Not many places run CI at Mozilla scale :-) Hopefully this will change and we'll be able to scale testing on Mac products like we do for Android and Linux in a cloud.

Read more...

Mozilla pushes - February 2015

>> Tuesday, March 17, 2015

Here's February's 2015 monthly analysis of the pushes to our Mozilla development trees. You can load the data as an HTML page or as a json file.

Trends
Although February is a shorter month, the number of pushes were close to those recorded in the previous month.  We had a higher average number of daily pushes (358) than in January (348).

Highlights
10015 pushes
358 pushes/day (average)
Highest number of pushes/day: 574 pushes on Feb 25, 2015
23.18 pushes/hour (highest)

General Remarks
Try had around 46% of all the pushes
The three integration repositories (fx-team, mozilla-inbound and b2g-inbound) account around 22% of all the pushes

Records
August 2014 was the month with most pushes (13090  pushes)
August 2014 has the highest pushes/day average with 422 pushes/day
July 2014 has the highest average of "pushes-per-hour" with 23.51 pushes/hour
October 8, 2014 had the highest number of pushes in one day with 715 pushes 





Read more...

Release Engineering special issue now available

>> Wednesday, February 25, 2015

The release engineering special issue of IEEE software was published yesterday. (Download pdf here).  This issue focuses on the current state of release engineering, from both an industry and research perspective. Lots of exciting work happening in this field!

I'm interviewed in the roundtable article on the future of release engineering, along with Chuck Rossi of Facebook and Boris Debic of Google.  Interesting discussions on the current state of release engineering at organizations that scale large number of builds and tests, and release frequently.  As well,  the challenges with mobile releases versus web deployments are discussed. And finally, a discussion of how to find good release engineers, and what the future may hold.

Thanks to the other guest editors on this issue -  Stephany Bellomo, Tamara Marshall-Klein, Bram Adams, Foutse Khomh and Christian Bird - for all their hard work that make this happen!


As an aside, when I opened the issue, the image on the front cover made me laugh.  It's reminiscent of the cover on a mid-century science fiction anthology.  I showed Mr. Releng and he said "Robot birds? That is EXACTLY how I pictured working in releng."  Maybe it's meant to represent that we let software fly free.  In any case, I must go back to tending the flock of robotic avian overlords.

Read more...

Mozilla pushes - January 2015

>> Friday, February 13, 2015

Here's January 2015's monthly analysis of the pushes to our Mozilla development trees. You can load the data as an HTML page or as a json file.

Trends
We're back to regular volume after the holidays. Also, it's really cold outside in some parts of the of the Mozilla world.  Maybe committing code > going outside.


Highlights
10798 pushes
348 pushes/day (average)
Highest number of pushes/day: 562 pushes on Jan 28, 2015
18.65 pushes/hour (highest)

General Remarks
Try had around around 42% of all the pushes
The three integration repositories (fx-team, mozilla-inbound and b2g-inbound) account around 24% of all of the pushes

Records
August 2014 was the month with most pushes (13,090  pushes)
August 2014 has the highest pushes/day average with 422 pushes/day
July 2014 has the highest average of "pushes-per-hour" with 23.51 pushes/hour
October 8, 2014 had the highest number of pushes in one day with 715 pushes 




Read more...

Reminder: Releng 2015 submissions due Friday, January 23

>> Wednesday, January 21, 2015

Just a reminder that submissions for the Releng 2015 conference are due this Friday, January 23. 

It will be held on May 19, 2015 in Florence Italy.

If you've done recent work like

  • migrating your build or test pipeline to the cloud
  • switching to a new build system
  • migrating to a new version control system
  • optimized your configuration management system or switched to a new one
  • implemented continuous integration for mobile devices
  • reduced end to end build times
  • or anything else build, release, configuration and test related
we'd love to hear from you.  Please consider submitting a talk!

In addition, if you have colleagues that work in this space that might have interesting topics to discuss at this workshop, please forward this information. I'm happy to talk to people about the submission process or possible topics if there are questions.

Il Duomo di Firenze by ©eddi_07, Creative Commons by-nc-sa 2.0


Sono nel comitato che organizza la conferenza Releng 2015 che si terrà il 19 Maggio 2015 a Firenze. La scadenza per l’invio dei paper è il 23 Gennaio 2015.

http://releng.polymtl.ca/RELENG2015/html/index.html

se avete competenze in:
  • migrazione del sistema di build o dei test nel cloud
  • aggiornamento del processo di build
  • migrazione ad un nuovo sistema di version control
  • ottimizzazione o aggiornamento del configuration management system
  • implementazione di un sistema di continuos integration per dispositivi mobili
  • riduzione dei tempi di build
  • qualsiasi cambiamento che abbia migliorato il sistema di build/test/release
e volete discutere della vostra esperienza, inviateci una proposta di talk!

Per favore inoltrate questa richiesta ai vostri colleghi e alle persone interessate a questi argomenti. Nel caso ci fossero domande sul processo di invio o sui temi di discussione, non esitate a contattarmi.

(Thanks Massimo for helping with the Italian translation).

More information
Releng 2015 web page
Releng 2015 CFP now open

Read more...

Mozilla pushes - December 2014

>> Thursday, January 08, 2015


Here's December 2014's monthly analysis of the pushes to our Mozilla development trees. You can load the data as an HTML page or as a json file.

Trends
There was a low number of pushes this month.  I expect this is due to the Mozilla all-hands in Portland in early December where we were encouraged to meet up with other teams instead of coding :-) and the holidays at the end of the month for many countries.
As as side node, in 2014 we had a total number of 124423 pushes, compared to 79233 in 2013 which represents a growth rate of 57% this year.

Highlights
7836 pushes
253 pushes/day (average)
Highest number of pushes/day: 706 pushes on Dec 17, 2014
15.25 pushes/hour (highest)

General Remarks
Try had around around 46% of all the pushes
The three integration repositories (fx-team, mozilla-inbound and b2g-inbound) account around 23% of all of the pushes

Records
August 2014 was the month with most pushes (13,090  pushes)
August 2014 has the highest pushes/day average with 422 pushes/day
July 2014 has the highest average of "pushes-per-hour" with 23.51 pushes/hour
October 8, 2014 had the highest number of pushes in one day with 715 pushes 







Read more...

  © Blogger template Simple n' Sweet by Ourblogtemplates.com 2009

Back to TOP