Data Science by ODS.ai 🦜

Channel address: @opendatascience
Categories: Technologies
Language: English
Subscribers: 51.65K
Description from channel

First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach editors contact: @haarrp

Ratings & Reviews

2.67 (3 reviews)


5 stars: 1
4 stars: 0
3 stars: 0
2 stars: 1
1 star: 1


Latest messages

2023-04-24 07:31:30 Generative Agents: Interactive Simulacra of Human Behavior

Imagine a world where computational software agents can simulate believable human behavior, empowering a wide range of interactive applications from immersive environments to rehearsal spaces for interpersonal communication and prototyping tools. This paper introduces "generative agents," a groundbreaking concept where agents perform daily routines, engage in creative activities, form opinions, interact with others, and remember and reflect on their experiences as they plan their next day.

To bring generative agents to life, the authors propose an innovative architecture that extends a large language model, allowing agents to store and reflect on their experiences using natural language and dynamically plan their behavior. They showcase the potential of generative agents in an interactive sandbox environment inspired by The Sims, where users can engage with a small town of 25 agents using natural language. The evaluation highlights the agents' ability to autonomously create and navigate complex social situations, producing believable individual and emergent social behaviors. This groundbreaking work demonstrates the critical contributions of observation, planning, and reflection components in agent architecture, laying the foundation for more realistic simulations of human behavior and unlocking exciting possibilities across various applications.

Paper link: https://arxiv.org/abs/2304.03442

Demo link: https://reverie.herokuapp.com/arXiv_Demo/#

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-ishb
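The paper's core retrieval idea — score each stored memory by a combination of recency, importance, and relevance, and surface the top few into the agent's context — can be sketched in a few lines. This is a pure-Python toy, not the authors' code: the names, weights, exponential decay rate, and the word-overlap stand-in for embedding similarity are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    created_at: float   # timestep when the memory was recorded
    importance: float   # 0..1, e.g. rated by the language model

def relevance(query: str, memory: Memory) -> float:
    """Stand-in for embedding similarity: crude word overlap."""
    q, m = set(query.lower().split()), set(memory.text.lower().split())
    return len(q & m) / max(len(q), 1)

def retrieve(memories, query, now, k=2, decay=0.99):
    """Rank memories by recency + importance + relevance, keep the top k."""
    def score(mem):
        rec = decay ** (now - mem.created_at)  # exponential recency decay
        return rec + mem.importance + relevance(query, mem)
    return sorted(memories, key=score, reverse=True)[:k]

memories = [
    Memory("Isabella is planning a Valentine's Day party", 1, 0.8),
    Memory("ate breakfast at the cafe", 9, 0.1),
    Memory("talked with Klaus about the party", 8, 0.5),
]
top = retrieve(memories, "what party is happening?", now=10)
```

In the paper the retrieved memories then feed the reflection and planning steps; here they would simply be concatenated into the next prompt.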

#deeplearning #nlp #generative #simulation
2023-04-20 15:57:47 DINOv2: Learning Robust Visual Features without Supervision

Get ready for a game-changer in computer vision! Building on the groundbreaking achievements in natural language processing, foundation models are revolutionizing the way we use images in various systems. By generating all-purpose visual features that excel across diverse image distributions and tasks without finetuning, these models are set to redefine the field.

The researchers behind this work have combined cutting-edge techniques to scale pretraining in terms of data and model size, turbocharging the training process like never before. They've devised an ingenious automatic pipeline to create a rich, diverse, and curated image dataset, setting a new standard in the self-supervised literature. To top it off, they've trained a colossal ViT model with a staggering 1 billion parameters and distilled it into a series of smaller, ultra-efficient models. These models outshine the best available all-purpose features, OpenCLIP, on most benchmarks at both image and pixel levels.

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-dinov2

Project link: https://dinov2.metademolab.com/
#deeplearning #cv #pytorch #imagesegmentation #sota #pretraining
Stability AI just released an initial set of StableLM-Alpha models, with 3B and 7B parameters. 15B and 30B models are on the way.

Base models are released under CC BY-SA-4.0.

StableLM-Alpha models are trained on a new dataset that builds on The Pile and contains 1.5 trillion tokens, roughly 3x the size of The Pile. These models will be trained on up to 1.5 trillion tokens. The context length for these models is 4096 tokens.

As a proof-of-concept, we also fine-tuned the model with Stanford Alpaca's procedure using a combination of five recent datasets for conversational agents: Stanford's Alpaca, Nomic-AI's gpt4all, RyokoAI's ShareGPT52K datasets, Databricks labs' Dolly, and Anthropic's HH. We will be releasing these models as StableLM-Tuned-Alpha.

https://github.com/Stability-AI/StableLM
2023-04-18 22:11:11
AI / ML / LLM / Transformer Models Timeline

This is a collection of important papers in the area of LLMs and Transformer models.
PDF file attached.
2023-04-17 20:55:07
AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges (Salesforce AI)

A review of the AIOps vision, trends, challenges and opportunities, specifically focusing on the underlying AI techniques.

1. INTRODUCTION
2. CONTRIBUTION OF THIS SURVEY
3. DATA FOR AIOPS
A. Metrics
B. Logs
C. Traces
D. Other data
4. INCIDENT DETECTION
A. Metrics based Incident Detection
B. Logs based Incident Detection
C. Traces and Multimodal Incident Detection
5. FAILURE PREDICTION
A. Metrics based Failure Prediction
B. Logs based Failure Prediction
6. ROOT CAUSE ANALYSIS
A. Metric-based RCA
B. Log-based RCA
C. Trace-based and Multimodal RCA
7. AUTOMATED ACTIONS
A. Automated Remediation
B. Auto-scaling
C. Resource Management
8. FUTURE OF AIOPS
A. Common AI Challenges for AIOps
B. Opportunities and Future Trends
9. CONCLUSION
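As a toy illustration of the metrics-based incident detection the survey covers (section 4.A), here is a rolling z-score detector over a latency series. The window size, threshold, and data are arbitrary choices for this sketch, not taken from the paper.

```python
import statistics

def detect_incidents(series, window=5, z_thresh=3.0):
    """Flag indices whose z-score vs. the trailing window exceeds z_thresh."""
    incidents = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mu = statistics.mean(hist)
        sigma = statistics.stdev(hist) or 1e-9  # guard against flat windows
        if (series[i] - mu) / sigma > z_thresh:
            incidents.append(i)
    return incidents

# Latency (ms): steady around 100, with a spike at index 8.
latency = [100, 101, 99, 100, 102, 100, 101, 99, 500, 100]
```

Production AIOps systems replace this with learned detectors (forecasting models, autoencoders, etc.), but the interface — a stream of metrics in, a set of anomalous timestamps out — is the same.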
2023-04-17 10:19:00 InceptionNeXt: When Inception Meets ConvNeXt

Large-kernel convolutions, such as those employed in ConvNeXt, can improve model performance but often come at the cost of efficiency due to high memory access costs. Although reducing kernel size may increase speed, it often leads to significant performance degradation.

To address this issue, the authors propose InceptionNeXt, which decomposes large-kernel depthwise convolution into four parallel branches along the channel dimension. This new Inception depthwise convolution results in networks with high throughput and competitive performance. For example, InceptionNeXt-T achieves 1.6x higher training throughput than ConvNeXt-T and a 0.2% top-1 accuracy improvement on ImageNet-1K. InceptionNeXt has the potential to serve as an economical baseline for future architecture design, helping to reduce carbon footprint.
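The channel-split idea can be sketched in pure Python. The four branches (1 x band, band x 1, small square, identity) and the 1/8 split per conv branch follow my reading of the paper; the naive loop convolution and uniform averaging kernels are purely for illustration — the real model uses learned depthwise kernels.

```python
def dwconv2d(chan, kh, kw):
    """Naive 'same'-padded depthwise conv of one channel, uniform kernel."""
    H, W = len(chan), len(chan[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for dy in range(-ph, kh - ph):
                for dx in range(-pw, kw - pw):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        acc += chan[yy][xx]
            out[y][x] = acc / (kh * kw)
    return out

def inception_dwconv(x, square=3, band=11):
    """x: list of channels; 1/8 of channels per conv branch, rest identity."""
    C = len(x)
    g = C // 8
    hw  = [dwconv2d(c, 1, band) for c in x[:g]]       # 1 x band branch
    wh  = [dwconv2d(c, band, 1) for c in x[g:2 * g]]  # band x 1 branch
    sq  = [dwconv2d(c, square, square) for c in x[2 * g:3 * g]]
    idt = x[3 * g:]                                   # identity branch
    return hw + wh + sq + idt                         # concat along channels

img = [[[1.0] * 16 for _ in range(16)] for _ in range(8)]  # 8 ch, 16x16
out = inception_dwconv(img)
```

Since only 3/8 of the channels touch a convolution at all, and the band kernels cost O(band) rather than O(band²) per pixel, the memory-access savings over a full large-kernel depthwise conv come essentially for free.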

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-inceptionnext

Paper link: https://arxiv.org/abs/2303.16900

Code link: https://github.com/sail-sg/inceptionnext

#cnn #deeplearning #computervision
2023-04-10 13:07:26 Paper Review: Segment Anything

- 99% of masks are automatic, i.e. generated without labels;
- The main image encoder model is huge;
- To produce masks you need a prompt or a somewhat accurate bbox (a partial bbox fails miserably);
- Trained on 128 / 256 GPUs;
- Most likely useful as a large-scale data annotation tool;
- Not sure it can be used in production as is; also, the license for the dataset is research-only, while the model is Apache 2.0.

https://andlukyane.com//blog/paper-review-sam

Unless you have a very specific project (i.e. you segment just one object type and have some priors), this can serve as a decent pre-annotation tool.

This is nice, but it can probably offset only 10-20% of CV annotation costs.
2023-04-08 08:00:57 Segment Anything

The Segment Anything project aims to democratize image segmentation in computer vision, a core task used across various applications such as scientific imagery analysis and photo editing. Traditionally, accurate segmentation models require specialized expertise, AI training infrastructure, and large amounts of annotated data. This project introduces a new task, dataset, and model for image segmentation to overcome these challenges and make segmentation more accessible.

The researchers are releasing the Segment Anything Model (SAM) and the Segment Anything 1-Billion mask dataset (SA-1B), the largest segmentation dataset to date. These resources will enable a wide range of applications and further research into foundational models for computer vision. The SA-1B dataset is available for research purposes, while the SAM is provided under the permissive Apache 2.0 open license. Users can explore the demo to try SAM with their own images.

Paper link: https://arxiv.org/abs/2304.02643

Code link: https://github.com/facebookresearch/segment-anything

Demo link: https://segment-anything.com/demo

Blogpost link: https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/

Dataset link: https://ai.facebook.com/datasets/segment-anything/

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-sam

#deeplearning #cv #pytorch #imagesegmentation #dataset
2023-04-07 23:05:20 Tabby: Self-hosted AI coding assistant

An open-source, on-prem alternative to GitHub Copilot.

- Self-contained, with no need for a DBMS or cloud service
- Web UI for visualizing and configuring models and MLOps
- OpenAPI interface, easy to integrate with existing infrastructure
- Consumer-grade GPU support (FP16 weight loading with various optimizations)
2023-04-06 18:08:34
Hey, let’s see how many of us have some Data Science-related vacancies to share. Please submit them through Google Form.

Best vacancies may be published in this channel.

Google Form: link.

#ds_jobs