Apr 21, 2023, Innovations

AI Flash – What’s cracking in the world of AI?

Paweł Pająk Senior .NET Developer
image
Welcome to the first installment of our AI Flash! As you have probably noticed, we are living in times when AI is blessed with inconceivable progress. Nearly every day, we hear about breakthroughs in the world of Artificial Intelligence. It gets very tricky to catch up with all this news. Hopefully, this newsletter will help you keep up to date. I’ll go through all these innovations, so you don’t have to (although you should!).

Catching the stuff up

In recent years, and even months, we have been witnessing amazing progress. Many new technologies that previously could have only been possible in dreams and science-fiction emerged, and boy, we are not stopping! Some things that were considered astonishing, cutting edge, state-of-the-art a year ago, can be considered outdated and legacy today.

Now, let’s just briefly (and I mean it) summarize the most important recent discoveries, before we move to the newies, so we can stay on the same boat.

GPT

Yes, you got it, there’s no talking about AI without mentioning GPT… This model has been with us since about 2018, but it got really crazy with the recent release of ChatGPT. Well, in the AI world, it wasn’t so recent, we are talking a history here, although timewise it was like a year ago, at the moment of writing. But that doesn’t matter, since I told you – a history… 

What’s this GPT? It’s an AI model capable of doing almost anything. Trained on an enormous amount of data it is supposed to perform very well in any given task, as opposed to other models that usually were fine tuned to achieve a certain task. Simply speaking – ask it anything and it shall do it. Solve your homework, write a code for you? Sure!

Midjourney, DALLE-2, Stable Diffusion

If, for any reason, you haven’t seen any of those, then, oh boy, oh boy, prepare to have your mind blown! We are talking about generating images by AI, from our text prompt. What does it mean? You simply type, in a natural language, what you want to see in the picture, and the AI generates it for you. Simple as that!

What’s remarkable is that these applications get better every day, so if you have seen them a year ago, and they left you unimpressed, then I strongly encourage you to give it another shot.

image

Sample image generated using Midjourney (yes, generated! That’s not a stock image!)

image

Santa Claus programming in space. Also made with Midjourney.

GitHub Copilot

Ever wondered what it would be like to do pair programming, but if your colleague would just cut to the chase, without his irritating remarks about your font size, and the way your icons are placed on the desktop? We’ve got you covered! GitHub Copilot is your AI assistant, integrated into your IDE (many integrations available e.g., Visual Studio, Jet Brains, VS Code). It can help write and complete the code with its suggestions. It can be a great time saver. Occasionally, it can even come up with very nice solutions one wouldn’t think of! Definitely worth trying!

image

Copilot in action

Newies

GPT-4

The latest version of GPT model. With its release, it made “old” ChatGPT look like a toy. Improved reasoning, less “hallucinations”, less likely to make things up. It can be used in the same way as the chat model (gpt-3.5) was used, but it’s more secure – it shouldn’t tell a user how to make a bomb, even if asked nicely. 

The great change is increased context. GPT-4 comes in two variants, 8k and 32k (tokens per context). What does that really mean? Its “memory” can hold up to about 8 or 32 thousand words, giving a possibility to generate longer outputs or being able to process way more data. 

You can literally paste the whole documentation of a library and ask it to generate some code based on it. It will do it, even if that library was created yesterday (remember it has a knowledge cutoff in late 2021) or was simply never published.

OpenAI states that GPT-4 can also understand images, but this feature is still in the preview, and is not publicly available.

If you want to try it for yourself, you can either get ChatGPT plus subscription or join the API access waitlist.

image
image

GPT-4 showing improvements over GPT-3.5

Bing AI

If you ask someone what’s the biggest limitation of Chat-GPT, I believe they might answer its problem with a knowledge cutoff and the fact it can make up facts. Why can’t it just ‘update’ its knowledge… or something? Well, it can’t, that’s the way it works, but that’s a story for a different occasion. 

So, what could we do to overcome this? Microsoft has an answer! Bing AI is using Chat-GPT under the hood (probably the GPT-4 version), but for each request it tries to do a web search to get knowledge required to answer a certain question. Basically, it searches the web for you. What’s really great is that it also lists sources, so you can quickly verify if what it said was true!

To use it, simply head over to the bing.com and go to “chat” and add yourself to the waiting list.

image

Bing AI answering according to the latest web search results

Whisper

While Whisper by OpenAI is not the newest kid on the block, surprisingly many people haven’t heard about it, so I feel obliged to show it. Whisper is a speech-to-text neural network that simply works great! It behaves very well. I’ve tested it in some harsh conditions, with mumbling and lisping, with superb results. Moreover, it’s open source! And achieves impressive results with multiple languages (works excellent even with my native language – Polish)! Like, no reasons not to like it, at all.

You can host it yourself, or use it via OpenAI API. 

There’s also a space where you can test it, so go give it a shot!

ElevenLabs

Text to speech… We’ve heard this before. Well, yes, except this time it’s really awesome! You can either create your own voice from existing presets, and adjust it or (this is huge) you can upload a sample of any voice and then use it to convert text to speech! Now, your chatbot can speak in your voice (or any other you can imagine, but please remember about copyrights and ethics in general).

Have a look at this fantastic demo, or try it yourself (there’s a free version, but to unlock the voice cloning option you need a paid subscription).

I hope you enjoyed this journey into the world of AI. 

 

Bottom Line: 

This whole issue has been written without using ChatGPT. Call it old school if you like…  😉

Share