tzikeh: (question - inquiry - bafflement)
[personal profile] tzikeh

Working from the supposition that all knowledge is contained in any random friendslist, I pose you this question:

If I wanted to research how to go about building a relatively simple online dictionary (probably using SQL and php), where would I go to find information about what kind of coding it would take, whether I'd need to lease an online thesaurus, etc.?

Date: 2007-03-07 07:23 pm (UTC)
From: [identity profile] halcyon-shift.livejournal.com
It sort of depends on the kind of traffic you're expecting, and how you want the front end to work for the user (auto-completing words like Google, or whatnot). You'd also want to think about caching the popular words to take the heat off the database.

Date: 2007-03-07 07:32 pm (UTC)
From: [identity profile] halcyon-shift.livejournal.com
Oh also http://www.gutenberg.org/dirs/etext96/pgwht04.txt ... don't know if that would be any use. Take some work to format it out as needed.

Date: 2007-03-07 07:44 pm (UTC)
From: [identity profile] tzikeh.livejournal.com
hm. Well, the traffic would likely be relatively light, and the front end would not auto-complete at all. It's for an English - Hidatsa gloss online dictionary (the Hidatsa language is dying out, and I think it has a total of 8700 words). The main problem, as I see it, is that if someone puts "happy" into the search, they'll get the Hidatsan gloss, but "joyful" or "pleased" or any of those will turn up nada.

Also the Hidatsa don't like white people much, so it's a touchy job. ;)

Date: 2007-03-07 07:49 pm (UTC)
From: [identity profile] tzikeh.livejournal.com
And here is their current website:

http://www.fbcc.bia.edu/

They want the dictionary to go on it. As you can see if you click around, they're not exactly web-savvy. So I need to know exactly what I'm talking about, and yet use de-techified words when explaining to them what it will entail. I don't even know if their host allows for SQL. (I'm assuming it does; most do at this point.)

Date: 2007-03-07 07:57 pm (UTC)
From: [identity profile] halcyon-shift.livejournal.com
The way to do it, then, would be to give each Hidatsa word its own record, with its translation(s) in another field. As the content wouldn't actually be changing often, you'd be good to use a fulltext search in SQL (which basically, if you're not up on it, searches through the text looking for matching words) rather than a LIKE search.

So to use your example, you'd have

hword             eword
[Hidatsa word]    [happy joyful pleased]

SELECT hword FROM words WHERE MATCH(eword) AGAINST('happy' IN BOOLEAN MODE);

That would return the Hidatsa word
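
A minimal sketch of that layout, assuming Python's built-in sqlite3 module in place of MySQL so it runs anywhere. SQLite has no MATCH ... AGAINST, so here the whole-word matching is done in Python by splitting eword on spaces; that is a swapped-in technique, not the fulltext query itself. The table and column names follow the example above, while the `find_hidatsa` helper and the second sample row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (hword TEXT, eword TEXT)")
conn.executemany(
    "INSERT INTO words (hword, eword) VALUES (?, ?)",
    [
        ("[Hidatsa word 1]", "happy joyful pleased"),
        ("[Hidatsa word 2]", "unhappy sad"),  # placeholder row, not real data
    ],
)


def find_hidatsa(english):
    """Return hwords whose eword field contains `english` as a whole word."""
    target = english.lower()
    rows = conn.execute("SELECT hword, eword FROM words ORDER BY rowid")
    return [hword for hword, eword in rows if target in eword.lower().split()]


# A LIKE '%happy%' search also matches "unhappy"; whole-word matching does not.
like_hits = conn.execute(
    "SELECT hword FROM words WHERE eword LIKE ? ORDER BY rowid", ("%happy%",)
).fetchall()

print(find_hidatsa("joyful"))
print(find_hidatsa("happy"))
print(len(like_hits))
```

The LIKE comparison is the main point: substring matching drags in "unhappy" when someone searches for "happy", which is one reason to prefer word-level matching.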

Date: 2007-03-07 11:28 pm (UTC)
From: [identity profile] keiko-kirin.livejournal.com
I can't help you with the programming, but fwiw, there seems to be a 1911 ed. of Roget's Thesaurus that's in the public domain. I would confirm this before using it, but if true, you can probably find it online somewhere and harvest the synonyms from there.
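
If a public-domain thesaurus does pan out, one way to use it is to fall back to synonyms when the exact search word has no entry. This is a sketch only, in Python for brevity; the tiny `THESAURUS` and `GLOSSES` dictionaries are invented placeholders, not real Roget's or Hidatsa data.

```python
# Hypothetical sample data; a real version would load harvested synonyms
# and the actual dictionary entries.
THESAURUS = {
    "happy": {"joyful", "pleased"},
}
GLOSSES = {"happy": "[Hidatsa word]"}


def build_synonym_index(thesaurus):
    """Map every word to the full set of words it shares an entry with."""
    index = {}
    for head, synonyms in thesaurus.items():
        group = {head} | set(synonyms)
        for word in group:
            index.setdefault(word, set()).update(group)
    return index


SYNONYMS = build_synonym_index(THESAURUS)


def lookup(english):
    """Try the word itself first, then any of its synonyms."""
    word = english.lower()
    if word in GLOSSES:
        return GLOSSES[word]
    for candidate in sorted(SYNONYMS.get(word, ())):
        if candidate in GLOSSES:
            return GLOSSES[candidate]
    return None


print(lookup("joyful"))
```

Synonyms in a thesaurus are only approximate, so a real site might want to tell the user which word was actually matched.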

I'll keep an eye open for better information. I have used many homegrown, simple online dictionaries for esoteric languages (Basque, Galician, and Breton come to mind) -- if you Google a bit for an obscure language's dictionary, you might find someone who's done it who explains how (ideal) or is willing to explain how. Omniglot.com links to dictionaries for many of the languages there; might be a place to start when looking for a model.

I feel sure it can be done in SQL, just based on the other kinds of SQL databases I've used.

Good luck!

Date: 2007-03-07 11:33 pm (UTC)
From: [identity profile] xenacryst.livejournal.com
Looks like their web host is running MS IIS (um, header analysis, yeah ;) I don't know if they've got any database available, but it's entirely possible that they don't. If they don't, then the job is much harder, of course, and I really can't say diddly about IIS publishing. If they don't have a database available, you could probably get by with an XML file, if you're willing to muck around with that, or even some sort of flat text file.
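
For the no-database case, a flat file can go a long way with a word list this small. A rough sketch in Python, assuming a tab-separated layout (Hidatsa word, then space-separated English glosses) that is invented here for illustration; the sample line and the `dictionary.txt` file name are placeholders.

```python
import io

# Hypothetical file contents: one entry per line, tab-separated.
SAMPLE_FILE = "[Hidatsa word]\thappy joyful pleased\n"


def load_dictionary(lines):
    """Build an English -> list of Hidatsa words index from tab-separated lines."""
    index = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or "\t" not in line:
            continue  # skip blank or malformed lines
        hword, ewords = line.split("\t", 1)
        for eword in ewords.lower().split():
            index.setdefault(eword, []).append(hword)
    return index


# io.StringIO stands in for open("dictionary.txt", encoding="utf-8").
index = load_dictionary(io.StringIO(SAMPLE_FILE))
print(index.get("pleased"))
```

The whole file is read into memory on each load, which should be fine at this scale.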

If they've got a database, then the web design and database should be relatively straightforward, once you figure out how you want to store the words in the database. Probably the harder part is managing database updates -- controlling who has access to that, whether it's even done over the web, or whatnot.

Yanno, it just occurred to me -- if they're using IIS, you could probably use an Access database. I tend to avoid Access like the plague, but it may be better than nothing...

Date: 2007-03-08 01:39 am (UTC)
From: [identity profile] corinna-5.livejournal.com
If your school has an MLS program, I would go ask them there for guidance on the information-management side of the problem, especially the thesauri issue. There would be at least one information-tools class in any MLS program worth the name; that's the prof you want.

Native American online dictionary

Date: 2007-03-08 02:21 am (UTC)
From: [identity profile] taverymate.livejournal.com
Not sure how large a project you're planning, and of course, you can scale it down depending on the resources you have available, but the development of an online dictionary for a Native American or First Nation language can be a hugely complicated endeavor - with technical web issues being but a small part.

One of my best friends is a professor of linguistics and her research area for the past twenty years has been a Northwestern Native American language. She also co-wrote and edited the first dictionary for that specific language. I just spoke with her and she said you could email her if you'd like. In particular, she's had enormous experience negotiating the difficulties of being a white, non-Native woman working with Native speakers on language issues over which they justifiably feel proprietary.

Email me at TaVeryMate at aol dot com and I'll give you her email.

Native American online dictionary resources

Date: 2007-03-08 02:48 am (UTC)
From: [identity profile] taverymate.livejournal.com
Again, not sure how many of the existing N.A. online dictionaries you've looked at already, but there are a variety of approaches and sizes. One increasing trend is to include sound files, especially important as first-language speakers are disappearing at an alarming rate and regional accents and dialects also need to be considered.

Some resources worth a look:

NativeWeb Resources - Native American Languages (Excellent site with good annotated lists of a wide variety of online dictionaries and other info):
http://www.nativeweb.org/resources/languages_linguistics/native_american_languages/

The American Indian Language Development Institute - hosted at U of Arizona
http://www.u.arizona.edu/~aildi/

Arizona Native American Online Dictionary Project
http://www.lexicon.arizona.edu/mikeserv/

Native Village Language Library
http://www.nativevillage.org/Libraries/Language%20Libraries.htm

Indigenous Language Institute
http://www.indigenous-language.org/

Native American Language Center - UC Davis
http://cougar.ucdavis.edu/nas/nalc

American Indian Language Resources (Particularly useful list of available downloadable fonts for various Native American languages. Also, the section on Dictionaries, Fonts and Specific Languages According to Family and the section on Online Language Materials and Language Lessons are wonderful.)
http://www2005.lang.osaka-u.ac.jp/~krkvls/lang.html

Good Luck!
