Posts in the 'Deep Learning' category (Page 6)

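https://modelscope.cn/studios/stepfun-ai/GOT_official_online_demo (GOT official demo: the official online demo of GOT-OCR-2.0, which implements OCR-2.0 with a unified end-to-end model; www.modelscope.cn) You can try GOT OCR 2.0 on the site above. It is said to perform well, but does it also recognize Korean well? On document images it seems to recognize the characters but not the spacing between words. Either way, Korean does appear to be in the training data. Maybe the paper will confirm this. On natural images it does not seem to recognize Korean well.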

Deep Learning 2024. 10. 26. 18:16

Benchmarks, leaderboards, and articles useful for gauging how well LLMs understand Korean

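This post was written on October 26, 2024. Please keep the date in mind when reading. https://lk.instruct.kr/ (LogicKor | a multi-domain reasoning benchmark for Korean language models. LogicKor measures the multi-domain reasoning ability of Korean language models across areas such as reasoning, math, writing, coding, comprehension, and grammar; lk.instruct.kr) If you plan to download a model and run it yourself rather than call it through an API, unchecking the private-models option and browsing the list should work. https://wandb.ai/wandb-korea/korean-llm-leaderboard/reports/-LLM---Vmlldzo3MzIyNDE2?accessToken=95bffmg3gwblgohulknz7go3h66k11..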

Deep Learning 2024. 10. 25. 21:22

Will Llama3 Bllossom 8B run on an RTX 3070 8GB GPU?

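I have been dabbling in LLMs and VLMs lately. I still don't know much and have only run example code. To start with, for the Llama3 8B model, loading the model weights onto the GPU alone took 15~16GB. That means I can't do much with the 3070 8GB GPU I have at home. There may be a way to load only part of the model onto the GPU, but I haven't looked into it yet. I noticed that LLMs are often released together with a 4-bit quantized version! Among these, there are publicly released weights whose training was led by Seoul National University of Science and Technology. https://huggingface.co/MLP-KTLim/llama-3-Korean-Bllossom-8B-gguf-Q4_..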
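The 15~16GB figure above is roughly what simple arithmetic predicts for 16-bit weights. The following is a minimal sketch of that estimate, not a measurement. It assumes "8B" means a round 8 billion parameters, and it counts only the weights, ignoring activations, the KV cache, and any per-block overhead that a quantized format adds. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed just to hold the weights, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 8e9    # assumption: "8B" taken as a round 8 billion parameters
GPU_MEMORY_GIB = 8  # RTX 3070 8GB, as in the post

for label, bits in [("fp16/bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    needed = weight_memory_gib(NUM_PARAMS, bits)
    # Compares weights only; real usage needs extra headroom on top of this.
    verdict = "weights alone fit" if needed < GPU_MEMORY_GIB else "weights alone do not fit"
    print(f"{label:>9}: ~{needed:.1f} GiB, {verdict} in {GPU_MEMORY_GIB} GiB")
```

Under these assumptions, 16-bit weights need close to twice the 8GB available, while a 4-bit version needs under half of it, which is why the 4-bit release is the natural candidate for this GPU.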

Deep Learning 2024. 10. 22. 00:21

MMPedestron, a human detection model I want to run but keep putting off because it's a hassle (2): trying to get more familiar with it

https://developer0hye.tistory.com/753 (MMPedestron, a human detection model I want to run but keep putting off because it's a hassle (1): I saw that a paper titled "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset" was accepted to ECCV 2024. https://arxiv.org/pdf/2407.10125 https://github.com/BubblyYi/MMPedestron Multi-Mod..; developer0hye.tistory.com) The errors just keep coming. https://www.youtube.com/watch?v=R_YdAer_H7..

Deep Learning 2024. 10. 19. 23:55

MMPedestron, a human detection model I want to run but keep putting off because it's a hassle (1)

I saw that a paper titled "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset" was accepted to ECCV 2024. https://arxiv.org/pdf/2407.10125 https://github.com/BubblyYi/MMPedestron Setting aside the multi-modal aspect, on human detection in the RGB domain alone it shows performance similar to InternImage, which can be considered an existing vision foundation model, with a much smaller model size. https://github.com/developer0hye/yolov8-vs-yolo..

Deep Learning 2024. 10. 7. 22:48

A taste of the image generation model FLUX

https://huggingface.co/black-forest-labs/FLUX.1-dev (black-forest-labs/FLUX.1-dev · Hugging Face: FLUX.1 [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. For more information, please read our blog post. Key Features Cutting-edge output quality, second only to our state-of-the-art model FLUX..; huggingface.co) https://huggingface.co/black-for..

Deep Learning 2024. 10. 6. 04:32

Quantitative performance comparison of YOLOv8 and YOLO11

https://github.com/developer0hye/yolov8-vs-yolo11/tree/main (GitHub - developer0hye/yolov8-vs-yolo11: The average precision per class for the YOLOv8 and YOLO11 pre-trained on the COCO dataset; github.com) My takeaway after compiling the results: while the currently popular world of LLMs and VLMs is heading toward AGI through scaling (although sLLMs exist too), YOLO..

Deep Learning 2024. 10. 3. 20:51

Per-class AP of COCO-pretrained YOLOv8

https://github.com/developer0hye/coco-pretrained-yolov8-ap-per-class (GitHub - developer0hye/coco-pretrained-yolov8-ap-per-class: The Average Precision per class for the YOLOv8 model pre-trained on the COCO dataset; github.com) mAP alone compresses too much information, so I saved several metric values for each class to a CSV file and..
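To illustrate why a single mAP number hides per-class behavior, here is a minimal sketch that averages per-class AP values and writes them out as CSV, sorted from weakest to strongest class. It is not the code from the linked repository. The function name `per_class_report`, the class names, and the AP values are all hypothetical, made up for illustration, and are not measured results.

```python
import csv
import io


def per_class_report(ap_by_class):
    """Return (mAP, csv_text) for a mapping of class name -> AP."""
    if not ap_by_class:
        raise ValueError("ap_by_class must not be empty")
    # mAP is the plain mean of the per-class AP values.
    mean_ap = sum(ap_by_class.values()) / len(ap_by_class)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["class", "AP"])
    # Sort ascending so the weakest classes appear first.
    for name, ap in sorted(ap_by_class.items(), key=lambda kv: kv[1]):
        writer.writerow([name, f"{ap:.3f}"])
    return mean_ap, buffer.getvalue()


# Hypothetical values for illustration only (not measured results).
example = {"class_a": 0.80, "class_b": 0.20, "class_c": 0.50}
mean_ap, csv_text = per_class_report(example)
print(f"mAP = {mean_ap:.3f}")
print(csv_text)
```

In this toy input the mean sits in the middle while one class is far below it, which is the kind of gap a per-class table makes visible.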

Deep Learning 2024. 9. 13. 18:47

Review of Sapiens: Foundation for Human Vision Models

Sapiens project page: Sapiens | Meta, Foundation models for human vision tasks (about.meta.com). Sapiens GitHub repo: GitHub - facebookresearch/sapiens: High-resolution models for human tasks. (github.com). Sapiens arXiv: Sapiens: Foundation for Human Vision Models. We present Sapiens, a family ..

Deep Learning 2024. 8. 26. 22:59

Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

When I'm bored I check which models are being added to the timm project, and a few days ago I noticed that there are models named vit_so400m~. https://github.com/huggingface/pytorch-image-models/blob/main/timm/models/vision_transformer.py (pytorch-image-models/timm/models/vision_transformer.py at main · huggingface/pytorch-image-models: The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weig..)

Deep Learning 2024. 8. 25. 10:10
