Point clouds do not load on Firefox mobile.
A couple of other recent similar papers (this UniK3D might be the best for single-image input, but I'm not sure):
https://vgg-t.github.io/
https://zhanghe3z.github.io/FLARE/
ETH has been insane the last few years. Just last year: literally dirt-cheap, literally lossless seasonal energy storage: https://ethz.ch/en/news-and-events/eth-news/news/2024/08/iro...
Now they have a better 3D perception algorithm than Tesla after 10 years of Musk's fraudulent AI claims...
There are quite a few monocular depth estimation models out there, and have been for years. This one looks pretty good. That said, the temporal stability seems pretty wobbly; I don't think I'd use it for a self-driving car.
The most impressive example was the point cloud they generated from the extreme fisheye lens; that was nice.
Predicting that the background on Cloud City was a flat matte painting is also impressive, in a way. It does seem to collapse all far-field objects into a single plane. That's a decent compromise for many things.
Another leading monocular depth estimation model, Marigold [1], is also from ETH.
[1] https://marigoldmonodepth.github.io/
My Tesla can't see a paper box in my garage.
Would it be possible to "auto" layer the outputs from multiple images?
I've seen tools for generative AI art that do something a lot like that [1].
Generate an image of subject A, then apply a depth estimation ML model to figure out which pixels belong to the foreground and cut them out.
Then generate an image of background B with a different prompt, and paste the subject from the first step on top. A sort of automatic collage-making process (roughly the pipeline sketched after the reference below).
[1] https://github.com/Extraltodeus/multi-subject-render
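For concreteness, here's a minimal sketch of that depth-cut-and-collage idea using Hugging Face diffusers and transformers. The checkpoints and the 70th-percentile foreground threshold are my own guesses for illustration, not what multi-subject-render actually does:

    # Rough sketch of the collage pipeline described above. Model choices
    # and the foreground threshold are illustrative assumptions.
    import numpy as np
    from PIL import Image
    from diffusers import StableDiffusionPipeline
    from transformers import pipeline

    sd = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    depth_model = pipeline("depth-estimation", model="Intel/dpt-large")

    # Step 1: generate subject A and estimate per-pixel depth.
    subject = sd("a photo of a corgi, studio lighting").images[0]
    depth = np.array(depth_model(subject)["depth"], dtype=np.float32)

    # Step 2: call the nearest ~30% of pixels "foreground". DPT predicts
    # inverse depth, so larger values mean closer to the camera.
    mask = depth > np.percentile(depth, 70)

    # Step 3: generate background B with a different prompt and paste the
    # masked subject on top -- the automatic collage step.
    background = sd("a misty alpine valley at dawn").images[0].resize(subject.size)
    background.paste(subject, (0, 0), Image.fromarray(mask.astype(np.uint8) * 255))
    background.save("collage.png")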
Once it's 3D it can be trivially converted to a point cloud (if it isn't exportable as a point cloud already), and then you can scale, overlay, splice, and merge anything with anything. Game over!
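For anyone curious what that conversion involves: it's just the standard pinhole unprojection. A minimal numpy sketch, assuming you know (or guess) the camera intrinsics fx, fy, cx, cy:

    # Turn an (H, W) metric depth map into an (H*W, 3) XYZ point cloud
    # via pinhole unprojection. The intrinsics below are illustrative.
    import numpy as np

    def depth_to_point_cloud(depth, fx, fy, cx, cy):
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx  # back-project pixel columns
        y = (v - cy) * depth / fy  # back-project pixel rows
        return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

    # Example: a flat wall 2 m away, seen by a 640x480 camera.
    pts = depth_to_point_cloud(np.full((480, 640), 2.0), 525.0, 525.0, 320.0, 240.0)
    print(pts.shape)  # (307200, 3)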
The EU has plans, and is taking steps, to get the main highway infrastructure ready for autonomous vehicles. They have been changing the colors on road signs for years (vertical and horizontal) so cars have a higher probability of decoding them, introducing C-ITS, etc. This can be another great addition to get us there safely, and not by listening to AI nonsense bros. While Europe is trying to build stuff while being attacked by Russia in more ways than one (https://www.theguardian.com/world/2024/dec/04/up-to-100-susp...), US "elites" are doing nothing but talking about impending doom and buying land on the Hawaiian islands or in New Zealand for apocalyptic shelters...
There's enough crazy stuff happening in image and video now that it would have turned heads at SIGGRAPH just a few years ago. It feels like we'll have fully programmable and deformable visual worlds soon.
I can't wait for that. But with so many new methods and papers coming out every week, I don't see how tooling can become stable enough to have a unified workflow that doesn't need a clean-slate takeover every 6 months.
> I don't see how tooling can become stable enough to have a unified workflow that doesn't need a clean-slate takeover every 6 months
That's a good thing. There's so much opportunity to disrupt incumbents now.
Build for what we have, but don't overfit. Be flexible for what's coming next.