Unicode vs UTF-8 | What is UTF-8 | New programming knowledge updates here

by Apinya Chakrii

Are you looking for the topic Unicode vs UTF-8? If so, please watch the video below.

About: what is UTF-8

What is the relationship between Unicode and UTF-8? Why do people get confused? In this video I try to explain the ideas behind character sets and Unicode, encodings, and UTF-8 in a straightforward way for anyone who isn't 100% confident. Then I switch gears and tell a story from my own past, when I wished this video had existed so I could watch it again and again. Follow me on Twitter. I stream on Twitch.

20 comments

William Swartzendruber 29/08/2021 - 22:02

3:23 – That's ASCII along with an 8-bit extension. Basic ASCII by itself is only 7-bit, ranging from 0-127.
3:54 – That is not the GB 18030 binary representation of the number 六; 0xC1 0xF9 is, which is only two bytes.

You're also overcomplicating this. Just say that Unicode defines a set of characters and assigns each one a unique number. UTF-8 and UTF-16 (and others) are simply ways of representing those numbers in memory. It is furthermore possible to convert between UTF-16 and UTF-8, as both can represent all possible Unicode values. Also mention that UTF-8 was developed to be compatible with basic ASCII, and that UTF-16 was designed to be compatible with UCS-2.
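The summary above (one number per character, several ways of writing that number as bytes) can be sketched in Python. This is only an illustration: the sample character 六 is the one mentioned at 3:54, and the variable names are made up here, not taken from the video.

```python
# Illustrative sketch: one code point, several byte representations.
ch = "\u516d"  # the character 六

code_point = ord(ch)               # the number Unicode assigns to the character
utf8 = ch.encode("utf-8")          # three bytes in UTF-8
utf16 = ch.encode("utf-16-be")     # two bytes in UTF-16 (big-endian, no BOM)
gb18030 = ch.encode("gb18030")     # two bytes in GB 18030

print(f"U+{code_point:04X}")                      # U+516D
print(utf8.hex(), utf16.hex(), gb18030.hex())     # e585ad 516d c1f9

# Converting UTF-16 to UTF-8 goes through the code points:
# decode the bytes to text, then encode the text again.
assert utf16.decode("utf-16-be").encode("utf-8") == utf8

# UTF-8 is byte-for-byte identical to ASCII for code points 0-127.
assert "A".encode("utf-8") == "A".encode("ascii") == b"A"
```

The GB 18030 result matches the two bytes 0xC1 0xF9 given in the comment.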

Deceptive facade 29/08/2021 - 22:02

I have watched so many videos trying to understand this, and yours is the one that finally made me nod and understand. Thank you so much!

Erik K 29/08/2021 - 22:02

I’m a bit confused and have two questions:

At 4:10, you call UTF-8 an ‘encoding’ (aka function) that maps bytes to Unicode.

Then at 7:10, you say that mapping Unicode code points to bytes is to ‘encode’, while the reverse is to ‘decode’. Shouldn’t UTF-8 be considered a ‘decoding’ instead of an ‘encoding’? Or maybe using the word ‘encoding’ as a synonym for ‘function’ intrinsically leads to confusion.

My second question: is UTF-8 a well-defined function? That is, does a sequence of bytes map to exactly one Unicode value via UTF-8? I think not, because the link below says a Unicode character can consist of MULTIPLE code points:

https://riptutorial.com/unicode/topic/6485/characters-can-consist-of-multiple-code-points
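One way to look at both questions, sketched in Python. The name "UTF-8" covers the scheme in both directions: encoding goes from code points to bytes, decoding goes back. A valid byte sequence decodes to exactly one sequence of code points; the "multiple code points" case is a layer above that, where one visible character can be spelled with more than one code point. The sample character é and the variable names are chosen here for illustration.

```python
import unicodedata

# First question: the same scheme is used in both directions.
text = "\u00e9"                       # é as a single code point, U+00E9
encoded = text.encode("utf-8")        # encode: code points -> bytes
decoded = encoded.decode("utf-8")     # decode: bytes -> code points
assert decoded == text                # the round trip is lossless

# Second question: the same visible character spelled as two code points,
# "e" followed by the combining acute accent U+0301.
decomposed = "e\u0301"
print(len(text), len(decomposed))     # 1 2

# Each spelling has its own byte sequence, and each byte sequence
# decodes back to exactly the code points it came from.
composed_bytes = text.encode("utf-8")          # b'\xc3\xa9'
decomposed_bytes = decomposed.encode("utf-8")  # b'e\xcc\x81'
assert composed_bytes != decomposed_bytes
assert decomposed_bytes.decode("utf-8") == decomposed

# Normalization, not UTF-8, is what reconciles the two spellings.
assert unicodedata.normalize("NFC", decomposed) == text
```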

Sivakumar Anbazhagan 29/08/2021 - 22:02

I was ashamed that I couldn't grasp this concept despite being a DB admin for 12 years, until someone asked about creating a DB as a Unicode DB or a non-Unicode one. I took the red pill and went deep into the rabbit hole.

Jade 29/08/2021 - 22:02

I understood it, thank you!

jvsnyc 29/08/2021 - 22:02

In my Windows 10 20H2 version, I see the encoding choices listed as:
ANSI
UTF-16 LE
UTF-16 BE
UTF-8
UTF-8 with BOM
That would no longer cause as much confusion as what you showed. Microsoft arguably made things much worse by prepending a BOM to UTF-8 encoded files, which makes no sense to those who know UTF-8: there is no such thing as BE or LE in UTF-8, it is unambiguously just UTF-8. So it is good to see that UTF-8 without the meaningless BOM is now the default and the simpler-looking of the two options, and that there is no longer an option named Unicode. They were trying to disambiguate between ANSI and UTF-8 up front with the BOM, but because nobody else did that, a whole lot of people got very confused by it. I have seen that happen this year: people couldn't understand the unexpected bad data at the top of their UTF-8 file, because they knew that "UTF-8 doesn't have a BOM", and it doesn't, unless it came from some Microsoft programs.
It is definitely true that a full understanding of all character sets and encodings fills not just a book, but an encyclopedia.
The basics were confused because early adopters of Unicode "knew" that every character in Unicode fit in 16 bits, because for many years it did. Famous examples were Microsoft Windows and Java. So calling UCS-2 Unicode through 1995 was just fine. I think the most confusion came from people continuing to use the term "Unicode" to mean UCS-2 after Unicode 2.0 came out, when UCS-2, which was stuck on the Basic Multilingual Plane (BMP), was superseded by UTF-16 and UTF-8 as two of the encodings that could represent the full Unicode 2.0 and later character sets. I am not sure how THAT happened, and it confused me as well. It might be reluctance to change anything that might break backwards compatibility…
So I too watched this thinking "the whole thing is silly, UTF-8 is just by far the most popular encoding for the Unicode character set!" but realized that the twisted history of Unicode support and naming conventions in various languages has left a lot of people confused.
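The "bad data at the top of the file" described above can be reproduced in a few lines of Python. This is a minimal sketch using the standard `utf-8-sig` codec; the sample text and variable names are made up for illustration.

```python
import codecs

text = "hi"
plain = text.encode("utf-8")           # b'hi'
with_bom = text.encode("utf-8-sig")    # b'\xef\xbb\xbfhi', as "UTF-8 with BOM" would save it

# The BOM is just the three bytes EF BB BF in front of the normal UTF-8 data.
assert with_bom == codecs.BOM_UTF8 + plain

# Decoding a BOM-prefixed file as plain UTF-8 leaves U+FEFF at the start,
# which is the unexpected "bad data" readers run into.
leftover = with_bom.decode("utf-8")
assert leftover == "\ufeffhi"

# The utf-8-sig codec strips the BOM when present and also accepts files without one.
clean = with_bom.decode("utf-8-sig")
assert clean == "hi"
assert plain.decode("utf-8-sig") == "hi"
```

Reading with `utf-8-sig` is a reasonable default when a file's origin is unknown, since it handles both variants.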

jvsnyc 29/08/2021 - 22:02

It's not nearly as complicated as all that. Unicode started coming together in ~1988, the first published standard was 1991. It wasn't until July 1996 that it stopped being the case that every single character fit into 16-bits. Everyone was using Unicode as a synonym for UCS-2 up until then. Even today, there are a lot of systems (fortunately fewer) that are still stuck on UCS-2 rather than UTF-8 or UTF-16, both of which can represent all the characters, not just the ones on the Basic Multilingual Plane (BMP) that UCS-2 can.

So calling UCS-2 "Unicode" was historically correct in 1988, 1989, … 1995, but since Unicode 2.0 in 1996 it has been wrong and an anachronism. UCS-2 is a subset of what we have today, both in terms of which characters you can express and in features…
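The difference between UCS-2 and UTF-16 shows up as soon as a character falls outside the BMP. A small Python sketch follows; the helper `utf16_units` and the two sample characters are hypothetical names chosen here, not something from the video.

```python
def utf16_units(s: str) -> list[int]:
    """Return the 16-bit code units of s in UTF-16 (big-endian, no BOM)."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp_char = "\u516d"          # 六, U+516D, inside the BMP
astral_char = "\U0001F600"   # 😀, U+1F600, outside the BMP

# Inside the BMP, one 16-bit unit is enough, so UCS-2 and UTF-16 agree.
bmp_units = utf16_units(bmp_char)
assert bmp_units == [0x516D]

# Outside the BMP, UTF-16 needs a surrogate pair (two 16-bit units).
# A fixed 16-bit encoding such as UCS-2 has no way to express this character.
astral_units = utf16_units(astral_char)
assert astral_units == [0xD83D, 0xDE00]

# UTF-8 covers the same character with four bytes.
astral_utf8 = astral_char.encode("utf-8")
assert astral_utf8 == b"\xf0\x9f\x98\x80"
```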

Pronto 29/08/2021 - 22:02

This video probably saved me several years of the journey you described. But I need to get off my phone and onto my computer to really understand this.

DarkSoldier 29/08/2021 - 22:02

Thanks pal, useful material. Go ahead.

Tiga Wu 29/08/2021 - 22:02

characters are bloated, ascii ftw

Balacon Enterprise 29/08/2021 - 22:02

What does UTF-8 stand for?

조현민 29/08/2021 - 22:02

this actually did help a lot thanks

mz mz 29/08/2021 - 22:02

Why do the text fonts on the computer not work on the phone?

mr. Nobody 29/08/2021 - 22:02

My brain got fucked up today by this stuff. I was looking at UTF-8 encoding and converting hexadecimal to decimal (I didn't even try binary). I was just trying to do simple XOR encryption in C++ and have no clue how to convert the XOR garbage buffer output to a BYTE integer array. The ASCII letters match, but not the Latin letters and some operators and emojis.

Nitin Garg 29/08/2021 - 22:02

Is it just me, or was the video of Ballmer, Gates and others cheering and dancing funny enough to distract you?

Muskan Mahajan 29/08/2021 - 22:02

This video was beautiful, thank you!!

Old Яomans 29/08/2021 - 22:02

So I am taking content from an old PDF and creating a manual using a commercial "SD1000" compliant FrameMaker ripoff made by some Russian douchebag. The source files are some unknown PDF. In Acrobat Pro I go into text edit mode, copy the info I need and paste it into the publishing software. Once I finish, I publish my new document to PDF, but there is a major issue: wherever a word contains the letter combination 'fi' or 'fl', a # symbol appears. I contact Drago and tell him his software is failing. He goes on to explain to me about Unicode and UTF-8, blah blah blah. IDGAF, but he says that he would have to charge us a consulting fee to "fix" this issue. It's like I'm on his toll road, there is a cow in the middle of the road, and he wants me to pay him to move his own cow off the road I'm already paying him to be on. WTF!!! I can take the text to Notepad and then move it to the publishing software, but that sounds daunting. Is there software that can "clean" the source PDF so this won't happen?
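One guess at the cause, not a confirmed diagnosis: PDFs often store "fi" and "fl" as single ligature characters (U+FB01 and U+FB02), and a tool whose font lacks those characters shows a placeholder instead. If that is what is happening here, Unicode compatibility normalization (NFKC) replaces the ligatures with plain letters. The sample string below is made up.

```python
import unicodedata

# Text as it might come out of the PDF, with ligature characters
# U+FB01 (fi) and U+FB02 (fl) in place of the letter pairs.
copied = "The \ufb01rst \ufb02oor of\ufb01ce"

# NFKC replaces compatibility characters such as ligatures
# with their plain-letter equivalents.
cleaned = unicodedata.normalize("NFKC", copied)
print(cleaned)   # The first floor office

assert "\ufb01" not in cleaned and "\ufb02" not in cleaned
```

NFKC also rewrites other compatibility characters (superscript digits, for example), so it is worth checking the result if the manual contains such symbols.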

CAS 29/08/2021 - 22:02

This video actually really helped me understand unicode and UTF-8! You deserve way more subscribers!

Undying EDM 29/08/2021 - 22:02

The representation of the process of translating bytes to the glyphs we see (around 2:47) could have been explained better, as some people might be confused about what the arrows mean (not me). If I got it right (hopefully), f would be UTF-8, x are the bytes (obviously) and y is the output in Unicode. The next thing I want to point out is the cuts in the audio and the obvious changes in how the voice sounds. This didn't bother me at all, but in terms of video editing it's something that can be improved. Towards the end I believe you should have recapped the difference between UTF-8 and Unicode: one is a large set of symbols and the other is a system to represent those symbols with bytes in computers.

Fun fact: I did some googling before playing this video, and for those who are getting started with encodings, it is important to know that they're different from the code pages seen in Windows (which represent character sets, just like Unicode). That's a detail for those who are working with cmd in Windows. Beyond that, it's important to note that the output stream of a program may also have an encoding (if we're talking about text), and if this stream is sent to a different program, the recipient must know the encoding to interpret the information correctly. Perhaps that was obvious or implied in the video, but it is important to know when you're about to code something that processes text in an encoding other than ASCII.

Anyway, great overall video, great introductory content and animations!
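The point about a recipient needing to know the sender's encoding can be shown with a short Python sketch. The sample word and the choice of Latin-1 as the wrong guess are assumptions made for illustration.

```python
text = "caf\u00e9"                 # "café"

# The sending program writes its output as UTF-8 bytes.
sent = text.encode("utf-8")        # b'caf\xc3\xa9'

# A recipient that assumes the wrong encoding still gets text, just the wrong text.
wrong = sent.decode("latin-1")
assert wrong == "caf\u00c3\u00a9"  # displays as "cafÃ©"

# A recipient that knows the stream is UTF-8 recovers the original.
right = sent.decode("utf-8")
assert right == text

# Pure ASCII text survives either guess, which is why the problem
# often goes unnoticed until non-ASCII characters appear.
assert "cafe".encode("utf-8").decode("latin-1") == "cafe"
```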

The Green Developer 29/08/2021 - 22:02

Super well made video, not gonna lie I wasn't expecting to watch through a video about Unicode and UTF-8, but it actually ended up clearing some stuff up for me. Great work!
