Topic: Computer science (Page 2)

You are looking at all articles with the topic "Computer science". We found 120 matches.


🔗 Levenshtein Distance

🔗 Computer science

In information theory, linguistics and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965.

Levenshtein distance may also be referred to as edit distance, although that term may also denote a larger family of distance metrics known collectively as edit distance. It is closely related to pairwise string alignments.
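The minimum-edits definition above leads directly to a dynamic-programming recurrence. As an illustrative sketch (function name and row-by-row memory layout are choices of this example, not part of the article):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    # prev[j] holds the distance between a[:i-1] and b[:j] (previous row)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]
```

For example, `levenshtein("kitten", "sitting")` is 3 (two substitutions and one insertion).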


🔗 CRDT: Conflict-free replicated data type

🔗 Computer science

In distributed computing, a conflict-free replicated data type (CRDT) is a data structure which can be replicated across multiple computers in a network, where the replicas can be updated independently and concurrently without coordination between the replicas, and where it is always mathematically possible to resolve inconsistencies which might result.

The CRDT concept was formally defined in 2011 by Marc Shapiro, Nuno Preguiça, Carlos Baquero and Marek Zawirski. Development was initially motivated by collaborative text editing and mobile computing. CRDTs have also been used in online chat systems, online gambling, and in the SoundCloud audio distribution platform. The NoSQL distributed databases Redis, Riak and Cosmos DB have CRDT data types.
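The "updated independently and merged without coordination" property can be made concrete with one of the simplest state-based CRDTs, a grow-only counter (G-Counter). This is a minimal sketch, not any particular library's API; class and method names are illustrative:

```python
class GCounter:
    """State-based grow-only counter CRDT: one slot per replica."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts: dict[str, int] = {}

    def increment(self, n: int = 1) -> None:
        # Each replica only ever increments its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)
```

Two replicas can increment concurrently and then merge in either order; both converge on the same total.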


🔗 Jaccard Index

🔗 Computer science 🔗 Statistics

The Jaccard index, also known as the Jaccard similarity coefficient, is a statistic used for gauging the similarity and diversity of sample sets. It was developed by Grove Karl Gilbert in 1884 as his ratio of verification (v), and is now frequently referred to as the Critical Success Index in meteorology. It was later developed independently by Paul Jaccard, who originally gave it the French name coefficient de communauté, and formulated again independently by T. Tanimoto; thus the names Tanimoto index and Tanimoto coefficient are also used in some fields. All of these variants, in general, take the ratio of intersection over union. The Jaccard coefficient measures similarity between finite sample sets and is defined as the size of the intersection divided by the size of the union of the sample sets:

J(A, B) = |A ∩ B| / |A ∪ B| = |A ∩ B| / (|A| + |B| − |A ∩ B|).

Note that by design, 0 ≤ J(A, B) ≤ 1. If A ∩ B is empty, then J(A, B) = 0. The Jaccard coefficient is widely used in computer science, ecology, genomics, and other sciences where binary or binarized data are used. Both exact and approximate methods are available for hypothesis testing with the Jaccard coefficient.

Jaccard similarity also applies to bags (i.e., multisets). The formula is similar, but the symbols denote bag intersection and bag sum rather than union; the maximum value is 1/2.

J(A, B) = |A ∩ B| / |A ⊎ B| = |A ∩ B| / (|A| + |B|).

The Jaccard distance, which measures dissimilarity between sample sets, is complementary to the Jaccard coefficient and is obtained by subtracting the Jaccard coefficient from 1, or, equivalently, by dividing the difference of the sizes of the union and the intersection of two sets by the size of the union:

d_J(A, B) = 1 − J(A, B) = (|A ∪ B| − |A ∩ B|) / |A ∪ B|.

An alternative interpretation of the Jaccard distance is as the ratio of the size of the symmetric difference A △ B = (A ∪ B) − (A ∩ B) to the size of the union. Jaccard distance is commonly used to calculate an n × n matrix for clustering and multidimensional scaling of n sample sets.

This distance is a metric on the collection of all finite sets.
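The set and bag definitions above translate almost directly into code. A short sketch (the convention that two empty sets have similarity 1 is a choice of this example; the article's formula leaves that case undefined):

```python
from collections import Counter

def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; defined here as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def jaccard_distance(a: set, b: set) -> float:
    return 1.0 - jaccard(a, b)

def jaccard_bags(a, b) -> float:
    """Bag (multiset) variant: intersection takes per-element minima,
    and the denominator is the bag sum |A| + |B|; maximum value is 1/2."""
    ca, cb = Counter(a), Counter(b)
    inter = sum((ca & cb).values())  # Counter & is element-wise min
    total = sum(ca.values()) + sum(cb.values())
    return inter / total if total else 0.0
```

For example, `jaccard({1, 2, 3}, {2, 3, 4})` is 0.5, and even identical bags score only 1/2 under the bag variant.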

There is also a version of the Jaccard distance for measures, including probability measures. If μ is a measure on a measurable space X, then we define the Jaccard coefficient by

J_μ(A, B) = μ(A ∩ B) / μ(A ∪ B),

and the Jaccard distance by

d_μ(A, B) = 1 − J_μ(A, B) = μ(A △ B) / μ(A ∪ B).

Care must be taken if μ(A ∪ B) = 0 or μ(A ∪ B) = ∞, since these formulas are not well defined in those cases.

The MinHash min-wise independent permutations locality sensitive hashing scheme may be used to efficiently compute an accurate estimate of the Jaccard similarity coefficient of pairs of sets, where each set is represented by a constant-sized signature derived from the minimum values of a hash function.
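The MinHash idea rests on the fact that, for a random hash function, the probability that two sets share the same minimum hash value equals their Jaccard similarity. A toy sketch (the salted-hash construction and function names here are illustrative, not the scheme's canonical form):

```python
import hashlib
import random

def minhash_signature(items: set, num_hashes: int = 256, seed: int = 0) -> list:
    """Constant-size signature: for each of num_hashes salted hash
    functions, keep the minimum hash value over the set's elements."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(64) for _ in range(num_hashes)]

    def h(salt: int, x) -> int:
        digest = hashlib.blake2b(f"{salt}:{x}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    return [min(h(s, x) for x in items) for s in salts]

def estimate_jaccard(sig_a: list, sig_b: list) -> float:
    # The fraction of positions where the minima agree estimates J(A, B).
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

With two overlapping ranges whose true similarity is 1/3, a 256-hash signature typically lands within a few percentage points of the exact value.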


🔗 Stochastic Parrot

🔗 Computer science 🔗 Philosophy 🔗 Philosophy/Contemporary philosophy 🔗 Philosophy/Philosophy of mind 🔗 Artificial Intelligence

In machine learning, "stochastic parrot" is a term coined by Emily M. Bender in the 2021 artificial intelligence research paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell. The term refers to "large language models that are impressive in their ability to generate realistic-sounding language but ultimately do not truly understand the meaning of the language they are processing."


🔗 One Instruction Set Computer

🔗 Computing 🔗 Computer science

A one-instruction set computer (OISC), sometimes called an ultimate reduced instruction set computer (URISC), is an abstract machine that uses only one instruction – obviating the need for a machine-language opcode. With a judicious choice for the single instruction and given infinite resources, an OISC is capable of being a universal computer in the same manner as traditional computers that have multiple instructions. OISCs have been recommended as aids in teaching computer architecture and have been used as computational models in structural computing research.
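One common choice for the single instruction is subleq ("subtract and branch if less than or equal to zero"). A toy interpreter, sketched under the usual conventions that memory holds signed integers and a negative branch target halts the machine:

```python
def run_subleq(mem: list, pc: int = 0, max_steps: int = 10_000) -> list:
    """subleq a b c: mem[b] -= mem[a]; jump to c if the result <= 0,
    otherwise fall through to the next instruction triple."""
    steps = 0
    while pc >= 0 and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem
```

For example, `run_subleq([3, 3, -1, 42])` clears cell 3 by subtracting it from itself, then halts; and the three-instruction program `[9, 11, 3, 11, 10, 6, 11, 11, -1, x, y, 0]` adds cell 9 into cell 10 via a scratch cell, which is how richer operations are built from the single instruction.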


🔗 Soundex – a phonetic algorithm for indexing names by sound

🔗 Computer science 🔗 Linguistics 🔗 Linguistics/Applied Linguistics

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling. The algorithm mainly encodes consonants; a vowel is not encoded unless it is the first letter. Soundex is the most widely known of all phonetic algorithms, in part because it is a standard feature of popular database software such as DB2, PostgreSQL, MySQL, SQLite, Ingres, MS SQL Server, and Oracle. Improvements to Soundex are the basis for many modern phonetic algorithms.
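A sketch of the classic American Soundex rules (first letter kept, consonants mapped to digits, adjacent duplicates collapsed, H and W transparent, padded to four characters):

```python
def soundex(name: str) -> str:
    """American Soundex code: the first letter plus three digits."""
    codes = {}
    for group, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")]:
        for ch in group:
            codes[ch] = digit
    name = "".join(ch for ch in name.upper() if ch.isalpha())
    if not name:
        return ""
    result = [name[0]]
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        if ch in "HW":          # H and W don't separate duplicate codes
            continue
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result.append(digit)
        prev = digit            # vowels (empty code) reset the duplicate check
    return "".join(result)[:4].ljust(4, "0")
```

For example, "Robert" and "Rupert" both encode to R163, which is exactly the homophone-matching behavior described above.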


🔗 Duff's Device

🔗 Computing 🔗 Computer science 🔗 C/C++ 🔗 C/C++/C

In the C programming language, Duff's device is a way of manually implementing loop unrolling by interleaving two syntactic constructs of C: the do-while loop and a switch statement. Its discovery is credited to Tom Duff in November 1983, when Duff was working for Lucasfilm and used it to speed up a real-time animation program.

Loop unrolling attempts to reduce the overhead of conditional branching needed to check whether a loop is done, by executing a batch of loop bodies per iteration. To handle cases where the number of iterations is not divisible by the unrolled-loop increments, a common technique among assembly language programmers is to jump directly into the middle of the unrolled loop body to handle the remainder. Duff implemented this technique in C by using C's case label fall-through feature to jump into the unrolled body.
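Duff's device itself depends on C's case-label fall-through and cannot be reproduced in languages without it. The general unroll-plus-remainder idea it implements can still be sketched; here the remainder is handled by an explicit prologue rather than by jumping into the middle of the unrolled body (names and the factor of 4 are illustrative):

```python
def fill(dst: list, value, count: int) -> None:
    """Write `value` into the first `count` slots of `dst`,
    unrolled four writes per loop-condition check."""
    i = 0
    for _ in range(count % 4):   # remainder prologue (what Duff's device
        dst[i] = value           # handles by jumping into the loop body)
        i += 1
    for _ in range(count // 4):  # unrolled body: 4 writes per branch check
        dst[i] = value
        dst[i + 1] = value
        dst[i + 2] = value
        dst[i + 3] = value
        i += 4
```

The payoff is one loop-termination test per four writes instead of one per write; the fall-through trick merely folds the prologue into the unrolled body itself.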


🔗 Negative Base

🔗 Computer science 🔗 Mathematics

A negative base (or negative radix) may be used to construct a non-standard positional numeral system. Like other place-value systems, each position holds multiples of the appropriate power of the system's base; but that base is negative: the base b is equal to −r for some natural number r (r ≥ 2).

Negative-base systems can accommodate all the same numbers as standard place-value systems, but both positive and negative numbers are represented without the use of a minus sign (or, in computer representation, a sign bit); this advantage is countered by an increased complexity of arithmetic operations. The need to store the information normally contained by a negative sign often results in a negative-base number being one digit longer than its positive-base equivalent.

The common names for negative-base positional numeral systems are formed by prefixing nega- to the name of the corresponding positive-base system; for example, negadecimal (base −10) corresponds to decimal (base 10), negabinary (base −2) to binary (base 2), negaternary (base −3) to ternary (base 3), and negaquaternary (base −4) to quaternary (base 4).
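Conversion to a negative base works like ordinary repeated division, with one twist: remainders must be forced into the range 0 to r−1. A negabinary sketch (function names are illustrative):

```python
def to_negabinary(n: int) -> str:
    """Digits of n in base -2; negatives need no minus sign."""
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:            # force the remainder into {0, 1}
            n, r = n + 1, r + 2
        digits.append(str(r))
    return "".join(reversed(digits))

def from_negabinary(s: str) -> int:
    """Evaluate a digit string as powers of -2."""
    return sum(int(d) * (-2) ** i for i, d in enumerate(reversed(s)))
```

For example, 6 encodes as 11010 (16 − 8 − 2) and −3 as 1101 (−8 + 4 + 1), both without a sign bit.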


🔗 Dennis Ritchie

🔗 Biography 🔗 Computing 🔗 Computer science 🔗 Biography/science and academia 🔗 New York (state) 🔗 New York (state)/Hudson Valley 🔗 Computing/Computer science 🔗 Software 🔗 Software/Computing 🔗 C/C++ 🔗 Japan 🔗 New Jersey 🔗 Linux

Dennis MacAlistair Ritchie (September 9, 1941 – c. October 12, 2011) was an American computer scientist. He created the C programming language and, with long-time colleague Ken Thompson, the Unix operating system and the B programming language. Ritchie and Thompson received the Turing Award from the ACM in 1983, the Hamming Medal from the IEEE in 1990, and the National Medal of Technology from President Bill Clinton in 1999. Ritchie was head of the Lucent Technologies System Software Research Department when he retired in 2007. He was the "R" in K&R C and was commonly known by his username, dmr.


🔗 Reversible computing

🔗 Technology 🔗 Computing 🔗 Computer science

Reversible computing is a model of computing where the computational process to some extent is time-reversible. In a model of computation that uses deterministic transitions from one state of the abstract machine to another, a necessary condition for reversibility is that the relation of the mapping from (nonzero-probability) states to their successors must be one-to-one. Reversible computing is a form of unconventional computing.
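The one-to-one condition can be illustrated with a toy state-transition example: XOR-ing one register into another is a bijection on pairs of states (it is its own inverse), while overwriting a register merges many predecessor states into one and so cannot be undone. A hedged sketch, with illustrative names:

```python
def cnot(x: int, y: int) -> tuple:
    """Reversible update in the style of a controlled-not gate:
    (x, y) -> (x, x ^ y). Applying it twice restores the input,
    so the mapping from states to successors is one-to-one."""
    return x, x ^ y

def overwrite(x: int, y: int) -> tuple:
    """Irreversible update: (x, y) -> (x, x). Many inputs share
    one output, so the predecessor state is lost."""
    return x, x
```

Checking injectivity over a small state space makes the contrast concrete: `cnot` permutes the states, `overwrite` collapses them.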
