All posts by apfelkraut

GitHub Copilot – Your AI-powered accomplice to steal code?

Last week GitHub and its parent company Microsoft announced “GitHub Copilot – their/your new AI pair programmer”. The New Stack, The Verge, and CNBC, among others, have reported on it extensively. There is a lot of buzz around this new service, especially within the Open Source and Free Software world, not only among its developers but also among its supporting lawyers and legal experts. Yet the actual news is not that groundbreaking, because Copilot is not the first of its kind: similar ML-/AI-based offerings such as Tabnine, Kite, CodeGuru, and IntelliCode are already out there and have also been trained on public code.

Copilot is currently in “technical preview” and, according to GitHub, is planned to be offered as a commercial version.

Illustration: GitHub Inc. © 2021

The core of it appears to be OpenAI Codex, a descendant of the famous GPT-3 for natural language processing. According to its homepage it “[…] has been trained on a selection of English language and source code from publicly available sources, including code in public repositories on GitHub”. Update 2021/07/08: GitHub Support appears to have confirmed that all public code at GitHub was used as training data.

GitHub is the platform where the majority of the global Open Source community's source code has by now accumulated: 65+ million developers, 200+ million repositories (as of 2021) or 23+ million owners of 128+ million public repositories (as of 2020). Alternatives have become scarce unless you want to host your code on your own.

Great, what amazing times we are living in! It sounds as if, with Copilot, you no longer need your human co-programmers, who assisted you during the good old times in the form of pair programming or code review. Lucky you, and especially your employer. On top of that you will save precious time, because it will help you to directly fix a bug, write typical functions or even “[…] learn how to use a new framework without spending most of your time spelunking through the docs or searching the web”. Not to mention copying and pasting useful code fragments from Stack Overflow or other publicly available sources like GitHub.

At the same time, two essential questions arise, in case you care a bit about authorship:

  1. Did the training of the AI infringe any copyright of the original authors who actually wrote the code that was used as training data?
  2. Will you violate any copyright by including Copilot’s code suggestions in your source code?

Let’s not talk about another aspect that GitHub mentions in their FAQs – personal data: “[…] In some cases, the model will suggest what appears to be personal data – email addresses, phone numbers, access keys, etc. […]”


The impact of Open Source within the European Union

The results of the Open Source Impact Study commissioned by the European Commission have been widely discussed, mainly because of their numbers. Though only announced now, the study identified for the year 2018 a contribution by FOSS of 0.4% to the GDP, worth EUR 63 billion, if measured by the increase in commits. 10% more contributors would even raise the GDP of the European Union by 0.6% (EUR 95 billion). The overall cost-benefit ratio is estimated to be at least 1:4.

But it gets even more interesting when looking into the results of the accompanying survey covering about 900 stakeholders (mainly companies) from all around Europe.

Their incentives for using and investing in Open Source were, sorted by relevance:

  1. finding technical solutions
  2. avoiding vendor lock-in
  3. carrying forward the state of the art of technology
  4. knowledge creation

The benefits they reported were:

  • support of open standards and interoperability
  • access to source code
  • independence from proprietary providers of software

Among the participants, the cost-benefit ratio was even estimated at 1:10.

Quite a few news outlets have reported on the presentation of the study’s findings at the OpenForum Europe Policy Summit 2021, though the final report to the Commission is still pending.

English: “How much are open-source developers really worth? Hundreds of billions of dollars, say economists” by Daphne Leprince-Ringuet
German: “Studie: Open Source trägt 95 Milliarden Euro zur EU-Wirtschaftskraft bei” by Stefan Krempl

Update 2021/02/15 – Netzpolitik.org today also published an interview with the innovation researcher Knut Blind, who played a major role in the study: “Open Source braucht öffentliche Finanzierung” by Alexander Fanta

Update 2021/09/06 – The full report has now been published: “Study about the impact of open source software and hardware on technological independence, competitiveness and innovation in the EU economy”.

Virtual Conference Experiences

The current circumstances also forced conferences (those gatherings with really large audiences) completely into cyberspace. Some stuck with traditional approaches, streaming talks via off-the-shelf videoconferencing applications and relying on the very limited integrated interaction features offered by these poor proprietary tools. Others have gone completely new ways and come up with fascinating, well-working concepts for still successfully connecting the crowds, enabling lively conversations, and facilitating the exchange of knowledge and experiences in a distant environment.

Let’s start with rc3 and its virtual conference venue in the form of rc3 world, implemented with WorkAdventure. In a pixelated 2D adventure style you could walk around the area, and as soon as you approached other characters, a live audio and video stream with the humans or other life forms controlling those characters would open. Limited to 4-5 persons at a time, it allowed you to talk directly with each other – face to face. Thanks to the limited number of participants you were still able to have a working conversation.

You somehow needed to get used to having unexpected and sudden interactions with one another on live video, but it still brought back the heavily missed opportunity to get in personal touch with other participants who possibly share similar interests.

rc3 world (screenshot by derstandard.at)

FOSDEM 2021, the world’s biggest conference on Free and Open Source Software, usually taking place in Brussels, had for me a very convincing overall concept. The organizers and infrastructure artists have done a tremendous job that allowed for the most impressive conference experience so far, and probably for a long time to come. Naturally, it was based purely on Free Software, with Matrix, Element, and Jitsi at its core.

How did it work and what was so great about it?

Presentations on specific areas of interest were grouped into virtual rooms with a fixed agenda, like at most physical conferences. Participants logged into a chat infrastructure which represented the rooms as group conversations. You would simply join the room(s) that you were interested in and could start texting with each other and the speakers, like on IRC. Talks had been recorded beforehand and were automatically started – by the computer (systemd) – at their scheduled time. Their audio and video were streamed right above your chat window. When the talk ended, the Q&A was streamed live for a fixed amount of time within that room until the next talk started auto-playing according to schedule. During that first part of the Q&A session of a talk, moderators were clarifying upvoted questions and comments from the chat and interacting in real time with the presenters. Those interested could then continue discussing with the speakers and further extend their conversation by switching to a separate room. So per talk you had a dedicated room for the second part of the Q&A that would open shortly after and even allowed anyone there to interact live via audio and video.
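The room timeline described above can be sketched in a few lines of Python. This is only an illustration of the scheduling logic, not FOSDEM's actual implementation (which, as mentioned, was driven by systemd). The `Slot` type, the `now_playing` function, the example times, and the 15-minute Q&A window for the last talk of a room are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Slot:
    """One talk in a room (hypothetical structure for illustration)."""
    title: str
    start: datetime          # scheduled start of the pre-recorded talk
    talk_length: timedelta   # length of the recording


def now_playing(schedule: List[Slot], now: datetime) -> Optional[str]:
    """Return what a room streams at `now`, or None outside the schedule.

    Assumes `schedule` is sorted by start time and slots do not overlap.
    """
    for i, slot in enumerate(schedule):
        talk_end = slot.start + slot.talk_length
        # The live Q&A runs until the next talk starts; the last slot
        # gets an assumed fixed 15-minute Q&A window.
        if i + 1 < len(schedule):
            qa_end = schedule[i + 1].start
        else:
            qa_end = talk_end + timedelta(minutes=15)
        if slot.start <= now < talk_end:
            return f"recording: {slot.title}"
        if talk_end <= now < qa_end:
            return f"live Q&A: {slot.title}"
    return None


if __name__ == "__main__":
    room = [
        Slot("Talk A", datetime(2021, 2, 6, 10, 0), timedelta(minutes=30)),
        Slot("Talk B", datetime(2021, 2, 6, 10, 45), timedelta(minutes=20)),
    ]
    # Talk A's recording ended at 10:30, Talk B starts at 10:45.
    print(now_playing(room, datetime(2021, 2, 6, 10, 35)))  # prints: live Q&A: Talk A
```

Because the next recording starts strictly by the clock, an overrunning Q&A can never delay the following talk; the conversation simply has to move on to the separate follow-up room.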

In sum that meant that you could check the schedule for topics you are interested in, connect at the announced time and be sure to really listen to that talk instead of watching tech staff doing mic checks or heavily delayed earlier talks whilst being unsure about if and when the one you came for would actually start.

In addition the highly valued Q&A and following backstage (and off the record) conversations could still take place without interrupting or being interrupted by the subsequent talk.

Just impressive and so useful! Thanks a lot to all who made this happen and work that well! These concepts are here to stay, even once conferences, hopefully soon, resume in the physical world.

2021/02/15 – Updated link of [matrix] to point at the now available summary of their efforts for FOSDEM 2021.

HowTo: Remote Working

I guess we have all meanwhile learned, more or less successfully, how to work from home (or the beach). The all-remote company GitLab Inc. has a helpful and extensive guide on how they implement this within their global organization – without a single office. It was inspiring to me, especially during the first lockdown of the pandemic, although I thought I had already gained quite some experience during 1.5 years of working remotely. The following recommendations probably also contain advice that originated from GitLab, but they are enriched with my personal flavor and learnings. I hope they are useful; they shall at least serve as a reminder to myself. I fear there are still some months left to professionalize this even further …

  • Intentionally start and end your work day, for example with a specific ritual: start with a morning walk, a shower, … and end at least by closing your work-related applications and their connections when you are done. Otherwise private and work life might mix too much and there won’t be any time for recreation and clearing your mind.
  • “Never assume the motives of others are, to them, less noble than yours are to you.” [according to J.P. Barlow] This is helpful advice not only in remote work, but especially in an isolated environment like home, where certain inquiries by your colleagues might appear out of context. If the other person then even catches you at the wrong moment or in the wrong mood, having an open, impartial, and collaborative conversation will be a challenge, at least for yourself. Assume that their personal motivation for approaching you is by default well meant and make sure to appreciate it.
  • Do not demonize doing private stuff for and within breaks. While working from home you will most probably be more focused and accordingly become exhausted more quickly. Taking a break to do something completely different like sports, laundry, or just going to the grocery store might at first glance appear improper. In the end it will spur your creativity, and with a refreshed mind you will be more efficient within the same timeframe than if you force yourself to continue working just for the sake of being on duty.
  • Optimize for asynchronous communication. The frequency of conference (“sync”) calls naturally increases in a remote setting, schedules will overlap, and participants might not always be as focused as they are during in-person meetings. Allow others to follow up offline, at a later point in time, and help them to continue where you last left off so that everyone’s valuable time is invested well. In addition you might want to record important conversations and at least keep detailed minutes that make it easy to catch up. Moreover, prefer group conversations over direct messages, as there are always others for whom this is or will become relevant as well.
  • Accept external disturbances during meetings. It is just inevitable that you will be interrupted in your home environment: kids, the postman ringing, or unstable infrastructure. Appreciate such unintended breaks, make the best of them, welcome the young generation or rearrange the discussion, as you cannot change it anyhow. And hey, are those interruptions really that different from a projector suddenly breaking down or colleagues bursting into a double-booked conference room?
  • Plan your week and set your priorities proactively. Otherwise your work stream will be dictated by allegedly “urgent” concerns popping in and by natural distractions due to your home environment. Better define beforehand what is really important for you to work on. It will be much easier to resume after uncontrolled interruptions, and you will always have an overview of where you currently are and what needs to be done next.
  • Electronic mail and communication platforms like Teams, Slack, etc. are asynchronous communication channels. This might come as a surprise to some. Neither feel obliged to reply immediately nor expect a prompt reply. If there is urgency, a direct call is still well suited for instant follow-up, without wasting time on interpreting possibly ambiguous text-based chats. In case of conflict, clarify your “service-level agreement” and how best to reach you in case of emergency.
  • Prefer video calls over voice calls. If infrastructure allows, switch on your camera. It will still bring you much closer to your colleagues and add an additional sense for improved interpersonal communication.
  • Fight screen fatigue. Do everything you can to keep yourself and your counterparts from staring at the screen all day long. E.g. when you need to do something creative, better choose pen and paper. While taking a break, do not read the mails or news on your screen again. Better take a walk and enjoy some fresh air and a wide-angle view. When you give a training/workshop/lesson, frequently enable participants to turn away from the conference session to do some self-directed tasks. In general, reduce the frequency and duration of meetings to a minimum.

In case you want to look at working from home from a statistical perspective, Atlassian Inc. has analyzed the first months after people started to work primarily from home. The company offers popular workflow solutions that are mainly used within software development, e.g. ever heard of Jira? Their study is a useful sample of remote work from within the software industry. Surprisingly, employees still kept working although they were no longer at the office. The author is even slightly worried about our work-life balance: “Proof our work-life balance is in danger (but there’s still hope)” by Arik Friedman.

(stupid white old) man on a parking lot

Reports have become rare in which drivers were misled by their car’s navigation system and accidentally slipped into a river, expecting a bridge that had not yet been built, or desperately ended up on a steep mountain trail unsuited for anything wider than a goat. All in all, you would feel safe to assume that nowadays route-based guidance has improved significantly and at the same time humans have learned to not always trust the machines. Most, not all.


Apollo in Sync

The Apollo space flight program is long history; even the 50th anniversary of Apollo 11 was already celebrated more than a year ago. Today’s headlines in spaceflight are written either by a psychologically conspicuous president building a Space Force or by a pot-smoking business magnate launching electric cars into space and polluting the night sky (and earth orbit) with his up to 42,000 Internet satellites.

In contrast, Ben Feist (Homepage | Twitter) has done phenomenal work in reviving the Apollo fever. What else can you call it when you are able to replay some of the original missions in real time – second by second from start to end – whilst being able to switch interactively between all mission control audio channels, public commentaries, and various video streams? In addition, the multimedia content has been enriched with photographs, transcripts, and many more details. All available sources have been carefully restored, synchronized, and packed into an intuitive and original user interface.

It seems that all of this fabulous work has taken years. Ben Feist’s first blog post describing the idea dates from 2012. The first public release was announced three years later, with additional features following in 2016. Oh, and that was only about Apollo 17, not to mention Apollo 11 and Apollo 13. In the end it seems to have brought him a job with NASA and collaborators like Stephen Slater, David Charney, Chris Bennett, Arnfinn Holderer, and Robin Wheeler.

Reading the statistics of the included real-time elements, for Apollo 11 for example, is just stunning: all mission control film footage, all TV transmissions and onboard film footage, 2,000 photographs, 11,000 hours of Mission Control audio, 240 hours of space-to-ground audio, all onboard recorder audio, 15,000 searchable utterances, …

Pro tip: Make sure that you do not have any further plans for the day before you click on https://apolloinrealtime.org/.

Aquanaute extraordinaire

Who does not know Jacques-Yves Cousteau and at least one of his legendary films such as “Die schweigende Welt”? The diving biologist and nature photographer Laurent Ballesta, born in 1974 and from Montpellier, was by contrast unknown to me until now. Some time ago I had already, though unknowingly, seen one of his films, which has fascinated me ever since: “Antarktis – Die Reise der Pinguine”. His other expeditions into the underwater world are in no way inferior to this work, and it would be hard to pick one of them as the best. All of them show breathtaking images, recount gripping experiences, and provide detailed insights into the habitats and ecosystems visited. At the end of each adventure you feel as if you yourself had to make a decompression stop or two in order to find your way back into the real world at all. In other words, and not only in the current times: rated “especially valuable” and very highly recommended!

ARTE currently offers some films by or with Laurent Ballesta in its series “Die Tiefen der Ozeane”; they are linked below until they are taken offline:

from “Planète Méditerranée” (c) Laurent Ballesta

Homepage: https://laurentballesta.com/