Data Science by ODS.ai 🦜

Channel address: @opendatascience
Categories: Technologies
Language: English
Subscribers: 51.65K
Description from channel

First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach the editors, contact: @haarrp

Ratings & Reviews

2.67 (3 reviews)

5 stars: 1
4 stars: 0
3 stars: 0
2 stars: 1
1 star: 1


The latest messages

2023-03-31 18:31:21
An AST-based Code Change Representation and its Performance in Just-in-time Vulnerability Prediction

The authors propose a novel way of representing changes in source code: the Code Change Tree, a form designed to keep only the differences between two abstract syntax trees of Java source code. The approach was evaluated on the task of predicting whether a code change introduces a vulnerability, and compared against multiple other representation types using a number of machine learning models as baselines. The evaluation is done on a novel dataset, VIC.

RQ1. Can a vulnerability-introducing database generated from a vulnerability-fixing commit database be used for vulnerability prediction?
RQ2. How effective are Code Change Trees in representing source code changes?
RQ3. Are source code metrics sufficient to represent code changes?

dataset paper
VIC dataset
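
To make the Code Change Tree idea concrete, here is a hedged toy sketch: keep only the AST subtrees that differ between two versions of a function. The paper works on Java; Python's ast module is used here purely as a stand-in, and the diff criterion is deliberately crude.

```python
import ast

def changed_subtrees(before_src: str, after_src: str):
    """Return AST nodes of the new version with no exact match in the old one."""
    before = {ast.dump(node) for node in ast.walk(ast.parse(before_src))}
    return [node for node in ast.walk(ast.parse(after_src))
            if ast.dump(node) not in before]

old = "def f(x):\n    return x + 1\n"
new = "def f(x):\n    return x + 2\n"
for node in changed_subtrees(old, new):
    print(type(node).__name__)  # only the changed chain: Module ... Constant
```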
3.1K views · 15:31
2023-03-30 10:56:54 Adobe does image generation

> Adobe announced a beta of Firefly, a generative ML tool for making images. Unlike MidJourney or Stable Diffusion (or Bing), this looks a lot more like an actual product: instead of typing 50-100 words into a box trying to refine your results, there are GUI tools and settings. It also has a much more clearly defined set of training data - note that Getty is suing Stable Diffusion for training on its images without permission. In more normal times this would be a huge story - now it’s only halfway down the page.

https://firefly.adobe.com/?ref=lore.ghost.io

This really looks like a product. Also, the numerous tags and knobs are probably sourced from internal Adobe data.

Lots of networks at work here - upscaling, CycleGAN-like domain transfer, inpainting, editing, plain generation, etc.

I understand that their demos are probably cherry-picked af, but proper product work is evident. This also probably shows the real niche these tools are meant to occupy - not "AGI".

It is also evident that the data requirements and scale needed to pull this off are huge.
2.6K views · 07:56
2023-03-29 16:42:53 Sparks of Artificial General Intelligence: Early experiments with GPT-4

TLDR: A paper from #Microsoft Research about #GPT4, showing behavior that can be considered early signs of #AGI.


ArXiv: https://arxiv.org/abs/2303.12712
2.7K views · 13:42
2023-03-29 00:45:49 My experience with PyTorch 2.0 so far:

[1] - packaging?
[2] - compilation errors

We will test other models as well.
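
For context, the PyTorch 2.0 entry point being exercised here is torch.compile; below is a minimal sketch with a toy model (the model is made up for illustration - any backend/compilation errors surface at the first call of the compiled module):

```python
import torch
import torch.nn as nn

# hypothetical toy model, not the one we actually tested
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
compiled = torch.compile(model)  # new in PyTorch 2.0

x = torch.randn(8, 64)
out = compiled(x)  # first call triggers graph capture + compilation
print(out.shape)   # torch.Size([8, 10])
```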
614 views · 21:45
2023-03-27 14:08:13 ReBotNet: Fast Real-time Video Enhancement

The authors introduce a novel Recurrent Bottleneck Mixer Network (ReBotNet) method, designed for real-time video enhancement in practical scenarios, such as live video calls and video streams. ReBotNet employs a dual-branch framework, where one branch focuses on learning spatio-temporal features, and the other aims to enhance temporal consistency. A common decoder combines the features from both branches to generate the improved frame. This method incorporates a recurrent training approach that utilizes predictions from previous frames for more efficient enhancement and superior temporal consistency.

To assess ReBotNet, the authors use two new datasets that simulate real-world situations and show that their technique surpasses existing methods in terms of reduced computations, decreased memory requirements, and quicker inference times.
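
As a rough, hedged illustration of the dual-branch + common decoder layout described above (layers and sizes are invented for the sketch and are not the authors' architecture):

```python
import torch
import torch.nn as nn

class TinyDualBranch(nn.Module):
    def __init__(self, ch: int = 16):
        super().__init__()
        # branch 1: spatio-temporal features
        self.spatiotemporal = nn.Conv3d(3, ch, kernel_size=3, padding=1)
        # branch 2: purely temporal mixing for consistency
        self.temporal = nn.Conv3d(3, ch, kernel_size=(3, 1, 1), padding=(1, 0, 0))
        # common decoder fuses both branches into the enhanced frames
        self.decoder = nn.Conv3d(2 * ch, 3, kernel_size=3, padding=1)

    def forward(self, clip, prev_pred=None):
        # clip: (batch, channels, time, height, width)
        if prev_pred is not None:
            clip = clip + prev_pred  # crude stand-in for recurrent use of past predictions
        feats = torch.cat([self.spatiotemporal(clip), self.temporal(clip)], dim=1)
        return self.decoder(feats)

clip = torch.randn(1, 3, 4, 64, 64)
print(TinyDualBranch()(clip).shape)  # torch.Size([1, 3, 4, 64, 64])
```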

Paper: https://arxiv.org/abs/2303.13504
Project link: https://jeya-maria-jose.github.io/rebotnet-web/

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-rebotnet

#deeplearning #cv #MachineLearning #VideoEnhancement #AI #Innovation #RealTimeVideo
2.6K views · 11:08
2023-03-27 10:07:45 Do large language models need sensory grounding for meaning and understanding?

TLDR: Yes

Slides from a philosophical debate featuring Yann LeCun, who claims that auto-regressive LLMs are exponentially diverging diffusion processes.


#LLM #YannLeCun
3.3K views · 07:07
2023-03-26 11:11:37 Interview with Ilya Sutskever

TLDR: theoretically, #chatgpt can learn a lot and eventually converge to #AGI, given a proper dataset and the help of #RLHF (Reinforcement Learning from Human Feedback).

The video provides valuable insights into the current state and future of artificial intelligence. The conversation explores the progress of AI, its limitations, and the importance of reinforcement learning and ethics in AI development. Ilya also discusses the potential benefits of AI in democracy and its potential role in helping humans manage society. This interview offers a comprehensive and thought-provoking overview of the AI landscape, making it a must-watch for anyone interested in understanding the impact of AI on our lives and the world at large.

Youtube:



#youtube #Sutskever #OpenAI #GPTEditor
1.5K views · 08:11
2023-03-20 16:04:44 Tracking the Fake GitHub Star Black Market with Dagster, dbt and BigQuery

This is a simple Dagster project to analyze the number of fake GitHub stars on any GitHub repository:
https://github.com/dagster-io/fake-star-detector
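
For a flavor of what such a detector can look for, here is a hedged sketch that flags stargazers with empty, throwaway-looking profiles via the public GitHub REST API. The heuristic is invented here for illustration; the Dagster project's actual logic lives in the repo above.

```python
import requests

def suspicious_stargazers(owner: str, repo: str, token: str) -> list[str]:
    headers = {"Authorization": f"token {token}"}
    stargazers = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/stargazers",
        headers=headers, params={"per_page": 100},
    ).json()
    flagged = []
    for user in stargazers:
        profile = requests.get(user["url"], headers=headers).json()
        # throwaway-account heuristic: no followers and no public repos
        if profile.get("followers", 0) == 0 and profile.get("public_repos", 0) == 0:
            flagged.append(profile["login"])
    return flagged

print(suspicious_stargazers("dagster-io", "fake-star-detector", "<YOUR_TOKEN>"))
```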
2.4K views · 13:04
2023-03-20 13:34:23 Hyena Hierarchy: Towards Larger Convolutional Language Models

Attention has been a cornerstone of deep learning, but it comes at a steep cost: quadratic in sequence length, which limits how much context a model can access. Subquadratic alternatives such as low-rank and sparse approximations have struggled to match its quality. That's where Hyena comes in!

Hyena is a revolutionary subquadratic drop-in replacement for attention that combines implicitly parametrized long convolutions and data-controlled gating. And the results speak for themselves! Hyena significantly improves accuracy in recall and reasoning tasks on long sequences, matching attention-based models.

In fact, Hyena sets a new state-of-the-art for dense-attention-free architectures in language modeling, reaching Transformer quality with 20% less training compute at sequence length 2K. And that's not all! Hyena operators are twice as fast as optimized attention at sequence length 8K and 100x faster at sequence length 64K.
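
To make "implicitly parametrized long convolutions and data-controlled gating" concrete, here is a hedged toy sketch in PyTorch: a channel-wise long convolution applied via FFT in O(L log L), wrapped in a learned gate. It illustrates the idea only and is not the paper's Hyena operator.

```python
import torch
import torch.nn as nn

class LongConvGate(nn.Module):
    def __init__(self, dim: int, seq_len: int):
        super().__init__()
        self.kernel = nn.Parameter(torch.randn(dim, seq_len) * 0.02)  # one long filter per channel
        self.gate = nn.Linear(dim, dim)                               # data-controlled gating

    def forward(self, x):  # x: (batch, seq_len, dim)
        L = x.shape[1]
        X = torch.fft.rfft(x.transpose(1, 2), n=2 * L)   # (batch, dim, freq)
        K = torch.fft.rfft(self.kernel, n=2 * L)         # (dim, freq)
        y = torch.fft.irfft(X * K, n=2 * L)[..., :L]     # FFT convolution, cropped to length L
        y = y.transpose(1, 2)
        return torch.sigmoid(self.gate(x)) * y           # gate the convolved signal

x = torch.randn(2, 1024, 64)
print(LongConvGate(64, 1024)(x).shape)  # torch.Size([2, 1024, 64])
```

The FFT trick is what keeps the cost subquadratic: a naive length-L convolution would be O(L^2), just like attention.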

Paper: https://arxiv.org/abs/2302.10866
Code link: https://github.com/HazyResearch/safari
Project link: https://hazyresearch.stanford.edu/blog/2023-03-07-hyena

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-hyena

#deeplearning #nlp #cv #languagemodel #convolution
3.1K views · 10:34
2023-03-17 22:47:49 In the meantime, some slides from my talks on NLP in 2022

https://docs.google.com/presentation/d/1m7Wpzaowbvi2je6nQERXyfQ0bzzS0dD0OArWznfOjHE/edit
1.6K views · 19:47